Overview


Dataset statistics

Number of variables: 59
Number of observations: 584592
Missing cells: 12980198
Missing cells (%): 37.6%
Total size in memory: 263.1 MiB
Average record size in memory: 472.0 B
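
The two derived figures in this table can be checked against the raw counts. The sketch below assumes that the missing-cell percentage is taken over all cells (variables times observations) and that the average record size is the total memory divided by the number of observations; the function names are hypothetical and are not part of ydata-profiling.

```python
# Raw figures copied from the "Dataset statistics" table.
N_VARIABLES = 59
N_OBSERVATIONS = 584_592
MISSING_CELLS = 12_980_198
TOTAL_MEMORY_MIB = 263.1  # already rounded to one decimal in the report


def missing_cell_share(missing: int, n_vars: int, n_obs: int) -> float:
    """Fraction of all cells (variables x observations) that are missing."""
    return missing / (n_vars * n_obs)


def average_record_bytes(total_mib: float, n_obs: int) -> float:
    """Average bytes per row, converting MiB to bytes (1 MiB = 1024**2 B)."""
    return total_mib * 1024**2 / n_obs


share = missing_cell_share(MISSING_CELLS, N_VARIABLES, N_OBSERVATIONS)
avg_bytes = average_record_bytes(TOTAL_MEMORY_MIB, N_OBSERVATIONS)
print(f"Missing cells: {share:.1%}")
print(f"Average record size: {avg_bytes:.1f} B")
```

Because 263.1 MiB is itself a rounded value, the recomputed record size can only be expected to agree with the reported 472.0 B to within a few tenths of a byte.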

Variable types

Text: 59

Dataset

Description: Birds NMNH Extant Specimen Records 0054887-241126133413365
URL: https://doi.org/10.15468/dl.2en7ue

Alerts

institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "BIRDS" Constant
datasetName has constant value "NMNH Extant Biology" Constant
basisOfRecord has constant value "PreservedSpecimen" Constant
kingdom has constant value "Animalia" Constant
phylum has constant value "Chordata" Constant
class has constant value "Aves" Constant
taxonRank has constant value "subspecies" Constant
recordNumber has 584474 (> 99.9%) missing values Missing
recordedBy has 7123 (1.2%) missing values Missing
lifeStage has 459507 (78.6%) missing values Missing
associatedMedia has 26026 (4.5%) missing values Missing
associatedSequences has 580105 (99.2%) missing values Missing
occurrenceRemarks has 572414 (97.9%) missing values Missing
eventDate has 41253 (7.1%) missing values Missing
startDayOfYear has 55224 (9.4%) missing values Missing
endDayOfYear has 55046 (9.4%) missing values Missing
year has 41253 (7.1%) missing values Missing
month has 53401 (9.1%) missing values Missing
day has 74067 (12.7%) missing values Missing
verbatimEventDate has 235442 (40.3%) missing values Missing
habitat has 567355 (97.1%) missing values Missing
continent has 12727 (2.2%) missing values Missing
waterBody has 558515 (95.5%) missing values Missing
stateProvince has 93871 (16.1%) missing values Missing
county has 353572 (60.5%) missing values Missing
locality has 107551 (18.4%) missing values Missing
minimumElevationInMeters has 498025 (85.2%) missing values Missing
maximumElevationInMeters has 574727 (98.3%) missing values Missing
verbatimElevation has 583323 (99.8%) missing values Missing
decimalLatitude has 556582 (95.2%) missing values Missing
decimalLongitude has 556582 (95.2%) missing values Missing
geodeticDatum has 584234 (99.9%) missing values Missing
verbatimLatitude has 561806 (96.1%) missing values Missing
verbatimLongitude has 562895 (96.3%) missing values Missing
verbatimCoordinateSystem has 567281 (97.0%) missing values Missing
georeferenceProtocol has 583342 (99.8%) missing values Missing
identificationQualifier has 583894 (99.9%) missing values Missing
typeStatus has 580614 (99.3%) missing values Missing
identifiedBy has 581206 (99.4%) missing values Missing
infraspecificEpithet has 268369 (45.9%) missing values Missing
taxonRank has 268369 (45.9%) missing values Missing
scientificNameAuthorship has 583452 (99.8%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
catalogNumber has unique values Unique
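
The alert list above can be read as three simple per-column checks. The following is a minimal sketch of such checks, not ydata-profiling's actual implementation: `column_alerts` and its `missing_threshold` parameter are hypothetical names, and the 1% threshold is an assumption chosen only because `recordedBy` (1.2% missing) is flagged in this report while `sex` (2 missing values) is not. It also assumes, as the two `taxonRank` alerts suggest, that "Constant" is judged over non-missing values only.

```python
from typing import List, Optional, Sequence


def column_alerts(
    values: Sequence[Optional[str]],
    missing_threshold: float = 0.01,  # assumed cut-off, not taken from the report
) -> List[str]:
    """Return alert labels for one column; None marks a missing cell."""
    present = [v for v in values if v is not None]
    n_missing = len(values) - len(present)
    distinct = set(present)

    alerts = []
    if present and len(distinct) == 1:
        alerts.append("Constant")  # a single value among the non-missing cells
    if values and n_missing / len(values) > missing_threshold:
        alerts.append("Missing")
    if present and n_missing == 0 and len(distinct) == len(values):
        alerts.append("Unique")  # every row holds a different value
    return alerts


# Toy columns shaped like taxonRank and gbifID (illustrative values only).
print(column_alerts(["subspecies", None, "subspecies", None]))
print(column_alerts(["4601228301", "1317203661", "1322538154"]))
```

Under these assumptions a column such as `taxonRank` can carry both a Constant and a Missing alert, which matches the two separate entries it has in the list.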

Reproduction

Analysis started: 2025-01-14 16:49:18.421951
Analysis finished: 2025-01-14 16:49:34.371745
Duration: 15.95 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
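
The reported duration follows directly from the two timestamps. A small sketch using only the standard library, with the timestamps copied from the lines above (the format string is an assumption about how they are laid out):

```python
from datetime import datetime

# Timestamp layout assumed from the "Analysis started/finished" lines.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2025-01-14 16:49:18.421951", FMT)
finished = datetime.strptime("2025-01-14 16:49:34.371745", FMT)

# Elapsed wall-clock time between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```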

Variables

gbifID
Text

Unique 

Distinct: 584592
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 5845920
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
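
As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for two sample values copied from this report; `category_counts` is a hypothetical helper, and the standard library does not provide the script or block properties, so only categories are shown. The two-letter codes correspond to the report's labels: `Nd` is Decimal Number, `Pd` is Dash Punctuation, `Po` is Other Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text: str) -> Counter:
    """Count characters by Unicode general category (e.g. 'Nd', 'Pd')."""
    return Counter(unicodedata.category(ch) for ch in text)


# Sample values copied from the report.
print(category_counts("4601228301"))
print(category_counts("2024-03-26 12:49:00"))
```

A 19-character timestamp of this shape contains 14 digits, 2 hyphens, 2 colons and 1 space, which is consistent with the per-category totals reported for the timestamp-valued variable (for example 14 × 584592 = 8184288 decimal digits).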

Unique

Unique: 584592
Unique (%): 100.0%

Sample

1st row: 4601228301
2nd row: 1317203661
3rd row: 1322538154
4th row: 1317205864
5th row: 1317207704

Value                  Count   Frequency (%)
4601228301             1       < 0.1%
1322540164             1       < 0.1%
1322550508             1       < 0.1%
1317268099             1       < 0.1%
1317208553             1       < 0.1%
1322538154             1       < 0.1%
1317205864             1       < 0.1%
1317207704             1       < 0.1%
1317208071             1       < 0.1%
1317232225             1       < 0.1%
Other values (584582)  584582  > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 1203296
20.6%
3 890257
15.2%
2 755814
12.9%
9 472136
 
8.1%
0 451322
 
7.7%
8 446725
 
7.6%
7 431175
 
7.4%
5 410207
 
7.0%
4 403288
 
6.9%
6 381700
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5845920
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1203296
20.6%
3 890257
15.2%
2 755814
12.9%
9 472136
 
8.1%
0 451322
 
7.7%
8 446725
 
7.6%
7 431175
 
7.4%
5 410207
 
7.0%
4 403288
 
6.9%
6 381700
 
6.5%

Most occurring scripts

ValueCountFrequency (%)
Common 5845920
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1203296
20.6%
3 890257
15.2%
2 755814
12.9%
9 472136
 
8.1%
0 451322
 
7.7%
8 446725
 
7.6%
7 431175
 
7.4%
5 410207
 
7.0%
4 403288
 
6.9%
6 381700
 
6.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5845920
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1203296
20.6%
3 890257
15.2%
2 755814
12.9%
9 472136
 
8.1%
0 451322
 
7.7%
8 446725
 
7.6%
7 431175
 
7.4%
5 410207
 
7.0%
4 403288
 
6.9%
6 381700
 
6.5%

Distinct: 11792
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11107248
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 4737
Unique (%): 0.8%

Sample

1st row: 2024-03-26 12:49:00
2nd row: 2022-07-12 14:29:00
3rd row: 2022-04-29 16:16:00
4th row: 2022-04-05 14:20:00
5th row: 2022-09-22 21:27:00

Value                Count   Frequency (%)
2022-09-22           268565  23.0%
2024-09-19           30525   2.6%
2022-04-07           28741   2.5%
2022-04-11           25217   2.2%
2022-04-12           24013   2.1%
2022-05-02           16128   1.4%
2022-04-29           15031   1.3%
2022-06-29           10897   0.9%
2022-04-18           9718    0.8%
2022-06-08           9587    0.8%
Other values (1533)  730762  62.5%

Most occurring characters

ValueCountFrequency (%)
2 2854206
25.7%
0 2776194
25.0%
- 1169184
10.5%
: 1169184
10.5%
1 800219
 
7.2%
584592
 
5.3%
9 465374
 
4.2%
4 411253
 
3.7%
5 256891
 
2.3%
3 228479
 
2.1%
Other values (3) 391672
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8184288
73.7%
Dash Punctuation 1169184
 
10.5%
Other Punctuation 1169184
 
10.5%
Space Separator 584592
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 2854206
34.9%
0 2776194
33.9%
1 800219
 
9.8%
9 465374
 
5.7%
4 411253
 
5.0%
5 256891
 
3.1%
3 228479
 
2.8%
7 147850
 
1.8%
8 125685
 
1.5%
6 118137
 
1.4%
Dash Punctuation
ValueCountFrequency (%)
- 1169184
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1169184
100.0%
Space Separator
ValueCountFrequency (%)
584592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11107248
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 2854206
25.7%
0 2776194
25.0%
- 1169184
10.5%
: 1169184
10.5%
1 800219
 
7.2%
584592
 
5.3%
9 465374
 
4.2%
4 411253
 
3.7%
5 256891
 
2.3%
3 228479
 
2.1%
Other values (3) 391672
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11107248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 2854206
25.7%
0 2776194
25.0%
- 1169184
10.5%
: 1169184
10.5%
1 800219
 
7.2%
584592
 
5.3%
9 465374
 
4.2%
4 411253
 
3.7%
5 256891
 
2.3%
3 228479
 
2.1%
Other values (3) 391672
 
3.5%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 16953168
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value                          Count   Frequency (%)
urn:lsid:biocol.org:col:34871  584592  100.0%

Most occurring characters

ValueCountFrequency (%)
o 2338368
13.8%
: 2338368
13.8%
l 1753776
 
10.3%
i 1169184
 
6.9%
r 1169184
 
6.9%
c 1169184
 
6.9%
g 584592
 
3.4%
7 584592
 
3.4%
8 584592
 
3.4%
4 584592
 
3.4%
Other values (8) 4676736
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11107248
65.5%
Other Punctuation 2922960
 
17.2%
Decimal Number 2922960
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 2338368
21.1%
l 1753776
15.8%
i 1169184
10.5%
r 1169184
10.5%
c 1169184
10.5%
g 584592
 
5.3%
u 584592
 
5.3%
b 584592
 
5.3%
d 584592
 
5.3%
s 584592
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 584592
20.0%
8 584592
20.0%
4 584592
20.0%
3 584592
20.0%
1 584592
20.0%
Other Punctuation
ValueCountFrequency (%)
: 2338368
80.0%
. 584592
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11107248
65.5%
Common 5845920
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2338368
21.1%
l 1753776
15.8%
i 1169184
10.5%
r 1169184
10.5%
c 1169184
10.5%
g 584592
 
5.3%
u 584592
 
5.3%
b 584592
 
5.3%
d 584592
 
5.3%
s 584592
 
5.3%
Common
ValueCountFrequency (%)
: 2338368
40.0%
7 584592
 
10.0%
8 584592
 
10.0%
4 584592
 
10.0%
3 584592
 
10.0%
. 584592
 
10.0%
1 584592
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16953168
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2338368
13.8%
: 2338368
13.8%
l 1753776
 
10.3%
i 1169184
 
6.9%
r 1169184
 
6.9%
c 1169184
 
6.9%
g 584592
 
3.4%
7 584592
 
3.4%
8 584592
 
3.4%
4 584592
 
3.4%
Other values (8) 4676736
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 26306640
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893
2nd row: urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893
3rd row: urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893
4th row: urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893
5th row: urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893

Value                                          Count   Frequency (%)
urn:uuid:73d83e23-1999-42cd-b38a-c06a7d32d893  584592  100.0%

Most occurring characters

ValueCountFrequency (%)
3 3507552
13.3%
d 2922960
11.1%
9 2338368
 
8.9%
- 2338368
 
8.9%
u 1753776
 
6.7%
8 1753776
 
6.7%
2 1753776
 
6.7%
7 1169184
 
4.4%
: 1169184
 
4.4%
c 1169184
 
4.4%
Other values (10) 6430512
24.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 12861024
48.9%
Lowercase Letter 9938064
37.8%
Dash Punctuation 2338368
 
8.9%
Other Punctuation 1169184
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 3507552
27.3%
9 2338368
18.2%
8 1753776
13.6%
2 1753776
13.6%
7 1169184
 
9.1%
1 584592
 
4.5%
4 584592
 
4.5%
0 584592
 
4.5%
6 584592
 
4.5%
Lowercase Letter
ValueCountFrequency (%)
d 2922960
29.4%
u 1753776
17.6%
c 1169184
 
11.8%
a 1169184
 
11.8%
i 584592
 
5.9%
e 584592
 
5.9%
r 584592
 
5.9%
n 584592
 
5.9%
b 584592
 
5.9%
Dash Punctuation
ValueCountFrequency (%)
- 2338368
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1169184
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 16368576
62.2%
Latin 9938064
37.8%

Most frequent character per script

Common
ValueCountFrequency (%)
3 3507552
21.4%
9 2338368
14.3%
- 2338368
14.3%
8 1753776
10.7%
2 1753776
10.7%
7 1169184
 
7.1%
: 1169184
 
7.1%
1 584592
 
3.6%
4 584592
 
3.6%
0 584592
 
3.6%
Latin
ValueCountFrequency (%)
d 2922960
29.4%
u 1753776
17.6%
c 1169184
 
11.8%
a 1169184
 
11.8%
i 584592
 
5.9%
e 584592
 
5.9%
r 584592
 
5.9%
n 584592
 
5.9%
b 584592
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26306640
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3 3507552
13.3%
d 2922960
11.1%
9 2338368
 
8.9%
- 2338368
 
8.9%
u 1753776
 
6.7%
8 1753776
 
6.7%
2 1753776
 
6.7%
7 1169184
 
4.4%
: 1169184
 
4.4%
c 1169184
 
4.4%
Other values (10) 6430512
24.4%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2338368
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value  Count   Frequency (%)
usnm   584592  100.0%

Most occurring characters

ValueCountFrequency (%)
U 584592
25.0%
S 584592
25.0%
N 584592
25.0%
M 584592
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2338368
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 584592
25.0%
S 584592
25.0%
N 584592
25.0%
M 584592
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2338368
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 584592
25.0%
S 584592
25.0%
N 584592
25.0%
M 584592
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2338368
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 584592
25.0%
S 584592
25.0%
N 584592
25.0%
M 584592
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 2922960
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BIRDS
2nd row: BIRDS
3rd row: BIRDS
4th row: BIRDS
5th row: BIRDS

Value  Count   Frequency (%)
birds  584592  100.0%

Most occurring characters

ValueCountFrequency (%)
B 584592
20.0%
I 584592
20.0%
R 584592
20.0%
D 584592
20.0%
S 584592
20.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2922960
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
B 584592
20.0%
I 584592
20.0%
R 584592
20.0%
D 584592
20.0%
S 584592
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2922960
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
B 584592
20.0%
I 584592
20.0%
R 584592
20.0%
D 584592
20.0%
S 584592
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2922960
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
B 584592
20.0%
I 584592
20.0%
R 584592
20.0%
D 584592
20.0%
S 584592
20.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11107248
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value    Count   Frequency (%)
nmnh     584592  33.3%
extant   584592  33.3%
biology  584592  33.3%

Most occurring characters

ValueCountFrequency (%)
N 1169184
 
10.5%
1169184
 
10.5%
t 1169184
 
10.5%
o 1169184
 
10.5%
M 584592
 
5.3%
H 584592
 
5.3%
E 584592
 
5.3%
x 584592
 
5.3%
a 584592
 
5.3%
n 584592
 
5.3%
Other values (5) 2922960
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6430512
57.9%
Uppercase Letter 3507552
31.6%
Space Separator 1169184
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1169184
18.2%
o 1169184
18.2%
x 584592
9.1%
a 584592
9.1%
n 584592
9.1%
i 584592
9.1%
l 584592
9.1%
g 584592
9.1%
y 584592
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1169184
33.3%
M 584592
16.7%
H 584592
16.7%
E 584592
16.7%
B 584592
16.7%
Space Separator
ValueCountFrequency (%)
1169184
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9938064
89.5%
Common 1169184
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1169184
11.8%
t 1169184
11.8%
o 1169184
11.8%
M 584592
 
5.9%
H 584592
 
5.9%
E 584592
 
5.9%
x 584592
 
5.9%
a 584592
 
5.9%
n 584592
 
5.9%
B 584592
 
5.9%
Other values (4) 2338368
23.5%
Common
ValueCountFrequency (%)
1169184
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11107248
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1169184
 
10.5%
1169184
 
10.5%
t 1169184
 
10.5%
o 1169184
 
10.5%
M 584592
 
5.3%
H 584592
 
5.3%
E 584592
 
5.3%
x 584592
 
5.3%
a 584592
 
5.3%
n 584592
 
5.3%
Other values (5) 2922960
26.3%

basisOfRecord
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Characters and Unicode

Total characters: 9938064
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value              Count   Frequency (%)
preservedspecimen  584592  100.0%

Most occurring characters

ValueCountFrequency (%)
e 2922960
29.4%
r 1169184
 
11.8%
P 584592
 
5.9%
s 584592
 
5.9%
v 584592
 
5.9%
d 584592
 
5.9%
S 584592
 
5.9%
p 584592
 
5.9%
c 584592
 
5.9%
i 584592
 
5.9%
Other values (2) 1169184
 
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8768880
88.2%
Uppercase Letter 1169184
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2922960
33.3%
r 1169184
 
13.3%
s 584592
 
6.7%
v 584592
 
6.7%
d 584592
 
6.7%
p 584592
 
6.7%
c 584592
 
6.7%
i 584592
 
6.7%
m 584592
 
6.7%
n 584592
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
P 584592
50.0%
S 584592
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9938064
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2922960
29.4%
r 1169184
 
11.8%
P 584592
 
5.9%
s 584592
 
5.9%
v 584592
 
5.9%
d 584592
 
5.9%
S 584592
 
5.9%
p 584592
 
5.9%
c 584592
 
5.9%
i 584592
 
5.9%
Other values (2) 1169184
 
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9938064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2922960
29.4%
r 1169184
 
11.8%
P 584592
 
5.9%
s 584592
 
5.9%
v 584592
 
5.9%
d 584592
 
5.9%
S 584592
 
5.9%
p 584592
 
5.9%
c 584592
 
5.9%
i 584592
 
5.9%
Other values (2) 1169184
 
11.8%

occurrenceID
Text

Unique 

Distinct: 584592
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 36829296
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 584592
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/300075fa7-edd1-461a-9f08-e6ba501db28c
2nd row: http://n2t.net/ark:/65665/3000df15d-8cee-4e97-92ce-bb2a2eabd590
3rd row: http://n2t.net/ark:/65665/3ec08151f-42be-49b5-868b-d3deeddbd447
4th row: http://n2t.net/ark:/65665/30026d668-b659-45a3-8494-25f389913e98
5th row: http://n2t.net/ark:/65665/3003b6dd3-df37-400f-8ae6-e515ea9c2d04

Value                                                            Count   Frequency (%)
http://n2t.net/ark:/65665/300075fa7-edd1-461a-9f08-e6ba501db28c  1       < 0.1%
http://n2t.net/ark:/65665/3ec1dbc05-3709-4356-a820-34fb80d5a314  1       < 0.1%
http://n2t.net/ark:/65665/3ec937490-e545-4db6-812d-bbcfe6057996  1       < 0.1%
http://n2t.net/ark:/65665/302e7b9b3-e03c-4d08-a4a5-3110143884c6  1       < 0.1%
http://n2t.net/ark:/65665/3004420cd-5dd8-4d0b-bb81-5df504988ccf  1       < 0.1%
http://n2t.net/ark:/65665/3ec08151f-42be-49b5-868b-d3deeddbd447  1       < 0.1%
http://n2t.net/ark:/65665/30026d668-b659-45a3-8494-25f389913e98  1       < 0.1%
http://n2t.net/ark:/65665/3003b6dd3-df37-400f-8ae6-e515ea9c2d04  1       < 0.1%
http://n2t.net/ark:/65665/3003f1ccb-ef9c-4862-9369-5c82ac27e83e  1       < 0.1%
http://n2t.net/ark:/65665/30150f58d-26d0-475b-b905-a9bb8e072667  1       < 0.1%
Other values (584582)                                            584582  > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 2922960
 
7.9%
6 2852045
 
7.7%
- 2338368
 
6.3%
t 2338368
 
6.3%
5 2265649
 
6.2%
a 1827322
 
5.0%
2 1681511
 
4.6%
3 1680550
 
4.6%
e 1680227
 
4.6%
4 1679992
 
4.6%
Other values (16) 15562304
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15931516
43.3%
Lowercase Letter 13882676
37.7%
Other Punctuation 4676736
 
12.7%
Dash Punctuation 2338368
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2338368
16.8%
a 1827322
13.2%
e 1680227
12.1%
b 1241614
8.9%
n 1169184
8.4%
c 1097145
7.9%
f 1096769
7.9%
d 1093679
7.9%
k 584592
 
4.2%
r 584592
 
4.2%
Other values (2) 1169184
8.4%
Decimal Number
ValueCountFrequency (%)
6 2852045
17.9%
5 2265649
14.2%
2 1681511
10.6%
3 1680550
10.5%
4 1679992
10.5%
8 1243182
7.8%
9 1240698
7.8%
1 1096282
 
6.9%
7 1096019
 
6.9%
0 1095588
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 2922960
62.5%
: 1169184
 
25.0%
. 584592
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2338368
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 22946620
62.3%
Latin 13882676
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2922960
12.7%
6 2852045
12.4%
- 2338368
10.2%
5 2265649
9.9%
2 1681511
7.3%
3 1680550
7.3%
4 1679992
7.3%
8 1243182
 
5.4%
9 1240698
 
5.4%
: 1169184
 
5.1%
Other values (4) 3872481
16.9%
Latin
ValueCountFrequency (%)
t 2338368
16.8%
a 1827322
13.2%
e 1680227
12.1%
b 1241614
8.9%
n 1169184
8.4%
c 1097145
7.9%
f 1096769
7.9%
d 1093679
7.9%
k 584592
 
4.2%
r 584592
 
4.2%
Other values (2) 1169184
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36829296
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 2922960
 
7.9%
6 2852045
 
7.7%
- 2338368
 
6.3%
t 2338368
 
6.3%
5 2265649
 
6.2%
a 1827322
 
5.0%
2 1681511
 
4.6%
3 1680550
 
4.6%
e 1680227
 
4.6%
4 1679992
 
4.6%
Other values (16) 15562304
42.3%

catalogNumber
Text

Unique 

Distinct: 584592
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 11
Median length: 11
Mean length: 10.92067972
Min length: 6

Characters and Unicode

Total characters: 6384142
Distinct characters: 17
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 584592
Unique (%): 100.0%

Sample

1st row: USNM A16396
2nd row: USNM 101402
3rd row: USNM B28085
4th row: USNM 289875
5th row: USNM 562118

Value                  Count   Frequency (%)
usnm                   584592  50.0%
438818                 1       < 0.1%
160226                 1       < 0.1%
540920                 1       < 0.1%
400497                 1       < 0.1%
b28085                 1       < 0.1%
289875                 1       < 0.1%
562118                 1       < 0.1%
b42715                 1       < 0.1%
378552                 1       < 0.1%
Other values (584583)  584583  50.0%

Most occurring characters

ValueCountFrequency (%)
U 584592
 
9.2%
S 584592
 
9.2%
N 584592
 
9.2%
M 584592
 
9.2%
584592
 
9.2%
3 396623
 
6.2%
4 396155
 
6.2%
5 388165
 
6.1%
1 387443
 
6.1%
2 382727
 
6.0%
Other values (7) 1510069
23.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3420292
53.6%
Uppercase Letter 2379258
37.3%
Space Separator 584592
 
9.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 396623
11.6%
4 396155
11.6%
5 388165
11.3%
1 387443
11.3%
2 382727
11.2%
6 326899
9.6%
0 287859
8.4%
9 286088
8.4%
8 284189
8.3%
7 284144
8.3%
Uppercase Letter
ValueCountFrequency (%)
U 584592
24.6%
S 584592
24.6%
N 584592
24.6%
M 584592
24.6%
B 34602
 
1.5%
A 6288
 
0.3%
Space Separator
ValueCountFrequency (%)
584592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4004884
62.7%
Latin 2379258
37.3%

Most frequent character per script

Common
ValueCountFrequency (%)
584592
14.6%
3 396623
9.9%
4 396155
9.9%
5 388165
9.7%
1 387443
9.7%
2 382727
9.6%
6 326899
8.2%
0 287859
7.2%
9 286088
7.1%
8 284189
7.1%
Latin
ValueCountFrequency (%)
U 584592
24.6%
S 584592
24.6%
N 584592
24.6%
M 584592
24.6%
B 34602
 
1.5%
A 6288
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6384142
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 584592
 
9.2%
S 584592
 
9.2%
N 584592
 
9.2%
M 584592
 
9.2%
584592
 
9.2%
3 396623
 
6.2%
4 396155
 
6.2%
5 388165
 
6.1%
1 387443
 
6.1%
2 382727
 
6.0%
Other values (7) 1510069
23.7%

recordNumber
Text

Missing 

Distinct: 4
Distinct (%): 3.4%
Missing: 584474
Missing (%): > 99.9%
Memory size: 4.5 MiB

Length

Max length: 5
Median length: 1
Mean length: 1.059322034
Min length: 1

Characters and Unicode

Total characters: 125
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
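
For a sparsely populated variable such as recordNumber, the length and percentage figures appear to be computed over the non-missing rows only (584592 − 584474 = 118), not over all observations. The sketch below recomputes them from counts copied from this section; the constant names are illustrative, and the choice of denominator is an inference from the numbers rather than a documented rule.

```python
# Counts copied from the recordNumber section of the report.
N_OBSERVATIONS = 584_592
MISSING = 584_474
TOTAL_CHARACTERS = 125
DISTINCT = 4
UNIQUE = 3

# Rows that actually hold a value; assumed to be the denominator used.
present = N_OBSERVATIONS - MISSING

print(f"Non-missing rows: {present}")
print(f"Mean length: {TOTAL_CHARACTERS / present:.9f}")
print(f"Distinct (%): {DISTINCT / present:.1%}")
print(f"Unique (%): {UNIQUE / present:.1%}")
```

With 118 as the denominator, the results reproduce the reported mean length of 1.059322034 and the 3.4% and 2.5% shares; dividing by all 584592 observations would not.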

Unique

Unique: 3
Unique (%): 2.5%

Sample

1st row: l
2nd row: l
3rd row: du
4th row: l
5th row: l

Value  Count  Frequency (%)
l      115    97.5%
du     1      0.8%
riley  1      0.8%
sta    1      0.8%

Most occurring characters

ValueCountFrequency (%)
l 116
92.8%
d 1
 
0.8%
u 1
 
0.8%
r 1
 
0.8%
i 1
 
0.8%
e 1
 
0.8%
y 1
 
0.8%
s 1
 
0.8%
t 1
 
0.8%
a 1
 
0.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 125
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 116
92.8%
d 1
 
0.8%
u 1
 
0.8%
r 1
 
0.8%
i 1
 
0.8%
e 1
 
0.8%
y 1
 
0.8%
s 1
 
0.8%
t 1
 
0.8%
a 1
 
0.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 125
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 116
92.8%
d 1
 
0.8%
u 1
 
0.8%
r 1
 
0.8%
i 1
 
0.8%
e 1
 
0.8%
y 1
 
0.8%
s 1
 
0.8%
t 1
 
0.8%
a 1
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 125
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 116
92.8%
d 1
 
0.8%
u 1
 
0.8%
r 1
 
0.8%
i 1
 
0.8%
e 1
 
0.8%
y 1
 
0.8%
s 1
 
0.8%
t 1
 
0.8%
a 1
 
0.8%

recordedBy
Text

Missing 

Distinct: 13250
Distinct (%): 2.3%
Missing: 7123
Missing (%): 1.2%
Memory size: 4.5 MiB

Length

Max length: 60
Median length: 55
Mean length: 11.76426613
Min length: 1

Characters and Unicode

Total characters: 6793499
Distinct characters: 65
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 6170
Unique (%): 1.1%

Sample

1st row: T. Page
2nd row: C. Worthen
3rd row: H. Lee
4th row: C. Sperry
5th row: C. Ross

Value                Count   Frequency (%)
a                    64567   4.8%
j                    60293   4.5%
e                    58464   4.4%
(blank)              56508   4.2%
w                    52970   4.0%
h                    41937   3.1%
m                    37812   2.8%
c                    37330   2.8%
t                    32505   2.4%
wetmore              32367   2.4%
Other values (7402)  863863  64.5%

Most occurring characters

ValueCountFrequency (%)
761147
 
11.2%
. 558992
 
8.2%
e 547336
 
8.1%
r 485535
 
7.1%
o 389498
 
5.7%
n 353948
 
5.2%
a 303496
 
4.5%
l 299899
 
4.4%
i 264364
 
3.9%
t 245352
 
3.6%
Other values (55) 2583932
38.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4115057
60.6%
Uppercase Letter 1287488
 
19.0%
Space Separator 761147
 
11.2%
Other Punctuation 622658
 
9.2%
Dash Punctuation 3521
 
0.1%
Decimal Number 2824
 
< 0.1%
Open Punctuation 402
 
< 0.1%
Close Punctuation 402
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 547336
13.3%
r 485535
11.8%
o 389498
9.5%
n 353948
 
8.6%
a 303496
 
7.4%
l 299899
 
7.3%
i 264364
 
6.4%
t 245352
 
6.0%
s 161797
 
3.9%
c 132938
 
3.2%
Other values (16) 930894
22.6%
Uppercase Letter
ValueCountFrequency (%)
W 117570
 
9.1%
C 117514
 
9.1%
B 99641
 
7.7%
A 96140
 
7.5%
M 90141
 
7.0%
H 82765
 
6.4%
R 77734
 
6.0%
P 76833
 
6.0%
J 69213
 
5.4%
S 67116
 
5.2%
Other values (16) 392821
30.5%
Other Punctuation
ValueCountFrequency (%)
. 558992
89.8%
& 56427
 
9.1%
, 6619
 
1.1%
' 606
 
0.1%
? 13
 
< 0.1%
/ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 1412
50.0%
1 708
25.1%
8 704
24.9%
Space Separator
ValueCountFrequency (%)
761147
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3521
100.0%
Open Punctuation
ValueCountFrequency (%)
( 402
100.0%
Close Punctuation
ValueCountFrequency (%)
) 402
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5402545
79.5%
Common 1390954
 
20.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 547336
 
10.1%
r 485535
 
9.0%
o 389498
 
7.2%
n 353948
 
6.6%
a 303496
 
5.6%
l 299899
 
5.6%
i 264364
 
4.9%
t 245352
 
4.5%
s 161797
 
3.0%
c 132938
 
2.5%
Other values (42) 2218382
41.1%
Common
ValueCountFrequency (%)
761147
54.7%
. 558992
40.2%
& 56427
 
4.1%
, 6619
 
0.5%
- 3521
 
0.3%
9 1412
 
0.1%
1 708
 
0.1%
8 704
 
0.1%
' 606
 
< 0.1%
( 402
 
< 0.1%
Other values (3) 416
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6793499
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
761147
 
11.2%
. 558992
 
8.2%
e 547336
 
8.1%
r 485535
 
7.1%
o 389498
 
5.7%
n 353948
 
5.2%
a 303496
 
4.5%
l 299899
 
4.4%
i 264364
 
3.9%
t 245352
 
3.6%
Other values (55) 2583932
38.0%

Distinct: 49
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB

Length

Max length: 3
Median length: 1
Mean length: 1.001168336
Min length: 1

Characters and Unicode

Total characters: 585275
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 14
Unique (%): < 0.1%

Sample

1st row1
2nd row1
3rd row4
4th row1
5th row1
ValueCountFrequency (%)
1 558309
95.5%
2 6799
 
1.2%
4 6794
 
1.2%
3 6435
 
1.1%
5 3136
 
0.5%
6 1204
 
0.2%
7 608
 
0.1%
8 374
 
0.1%
9 251
 
< 0.1%
10 169
 
< 0.1%
Other values (39) 513
 
0.1%

Most occurring characters

ValueCountFrequency (%)
1 559052
95.5%
2 6944
 
1.2%
4 6855
 
1.2%
3 6513
 
1.1%
5 3191
 
0.5%
6 1239
 
0.2%
7 631
 
0.1%
8 397
 
0.1%
9 270
 
< 0.1%
0 183
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 585275
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 559052
95.5%
2 6944
 
1.2%
4 6855
 
1.2%
3 6513
 
1.1%
5 3191
 
0.5%
6 1239
 
0.2%
7 631
 
0.1%
8 397
 
0.1%
9 270
 
< 0.1%
0 183
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 585275
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 559052
95.5%
2 6944
 
1.2%
4 6855
 
1.2%
3 6513
 
1.1%
5 3191
 
0.5%
6 1239
 
0.2%
7 631
 
0.1%
8 397
 
0.1%
9 270
 
< 0.1%
0 183
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 585275
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 559052
95.5%
2 6944
 
1.2%
4 6855
 
1.2%
3 6513
 
1.1%
5 3191
 
0.5%
6 1239
 
0.2%
7 631
 
0.1%
8 397
 
0.1%
9 270
 
< 0.1%
0 183
 
< 0.1%
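
The per-category tables in this report tally each character by its Unicode general category. A minimal sketch of that kind of tally, assuming Python's standard `unicodedata` module rather than the profiler's own implementation; the helper name `category_counts` and the name mapping are hypothetical, and the example strings are sample rows taken from elsewhere in the report:

```python
import unicodedata
from collections import Counter

# Two-letter general-category codes mapped to the long names the report prints.
# Only a subset is listed here (an assumption for brevity); unknown codes pass through.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
}

def category_counts(values):
    """Count characters per Unicode general category, skipping missing values."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Sample rows copied from the report; None stands in for a missing cell.
sample = ["Skin: Whole", "Egg(s)", None, "1859-05"]
print(category_counts(sample))
```

The standard library exposes general categories only; script and block tallies like those in the report would need a different data source.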

sex
Text

Distinct 6
Distinct (%) < 0.1%
Missing 2
Missing (%) < 0.1%
Memory size 4.5 MiB

Length

Max length 7
Median length 6
Mean length 5.236907918
Min length 4

Characters and Unicode

Total characters 3061444
Distinct characters 15
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Male
2nd row Female
3rd row unknown
4th row Male
5th row Male
ValueCountFrequency (%)
male 279199
47.8%
female 193089
33.0%
unknown 112302
19.2%

Most occurring characters

ValueCountFrequency (%)
e 664157
21.7%
a 471424
15.4%
l 471424
15.4%
n 336906
11.0%
M 279555
9.1%
F 193089
 
6.3%
m 192733
 
6.3%
k 112302
 
3.7%
o 112302
 
3.7%
w 112302
 
3.7%
Other values (5) 115250
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2558187
83.6%
Uppercase Letter 503257
 
16.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 664157
26.0%
a 471424
18.4%
l 471424
18.4%
n 336906
13.2%
m 192733
 
7.5%
k 112302
 
4.4%
o 112302
 
4.4%
w 112302
 
4.4%
u 84637
 
3.3%
Uppercase Letter
ValueCountFrequency (%)
M 279555
55.5%
F 193089
38.4%
U 27665
 
5.5%
E 1220
 
0.2%
A 864
 
0.2%
L 864
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 3061444
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 664157
21.7%
a 471424
15.4%
l 471424
15.4%
n 336906
11.0%
M 279555
9.1%
F 193089
 
6.3%
m 192733
 
6.3%
k 112302
 
3.7%
o 112302
 
3.7%
w 112302
 
3.7%
Other values (5) 115250
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3061444
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 664157
21.7%
a 471424
15.4%
l 471424
15.4%
n 336906
11.0%
M 279555
9.1%
F 193089
 
6.3%
m 192733
 
6.3%
k 112302
 
3.7%
o 112302
 
3.7%
w 112302
 
3.7%
Other values (5) 115250
 
3.8%
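
The sex variable reports six distinct values but only three lowercased tokens, and the sample rows mix "Male" with "unknown", which suggests the raw values differ only in capitalization. A minimal sketch of folding such variants together before counting; the helper name `normalise_sex` is hypothetical, and the "MALE" and "Unknown" inputs are illustrative rather than confirmed raw values:

```python
from collections import Counter

def normalise_sex(value):
    """Trim and case-fold a raw sex value; None (missing) stays None."""
    if value is None:
        return None
    return value.strip().casefold()

# "Male", "Female" and "unknown" come from the sample rows;
# "MALE" and "Unknown" are assumed case variants for illustration.
raw = ["Male", "Female", "unknown", "MALE", "Unknown", None]
counts = Counter(v for v in map(normalise_sex, raw) if v is not None)
print(counts)
```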

lifeStage
Text

Missing 

Distinct 14
Distinct (%) < 0.1%
Missing 459507
Missing (%) 78.6%
Memory size 4.5 MiB

Length

Max length 8
Median length 5
Mean length 5.960986529
Min length 5

Characters and Unicode

Total characters 745630
Distinct characters 27
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) < 0.1%

Sample

1st row Immature
2nd row Juvenile
3rd row Adult
4th row Adult
5th row Adult
ValueCountFrequency (%)
adult 81111
64.8%
immature 27828
 
22.2%
juvenile 10760
 
8.6%
chick 3709
 
3.0%
subadult 1382
 
1.1%
embryo 292
 
0.2%
young 2
 
< 0.1%
nestling 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
u 122465
16.4%
t 110322
14.8%
l 93254
12.5%
d 82493
11.1%
A 81087
10.9%
m 55948
7.5%
e 49350
6.6%
a 29234
 
3.9%
r 28120
 
3.8%
I 27825
 
3.7%
Other values (17) 65532
8.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 620594
83.2%
Uppercase Letter 125036
 
16.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 122465
19.7%
t 110322
17.8%
l 93254
15.0%
d 82493
13.3%
m 55948
9.0%
e 49350
8.0%
a 29234
 
4.7%
r 28120
 
4.5%
i 14473
 
2.3%
n 10764
 
1.7%
Other values (10) 24171
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
A 81087
64.9%
I 27825
 
22.3%
J 10753
 
8.6%
C 3697
 
3.0%
S 1381
 
1.1%
E 291
 
0.2%
Y 2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 745630
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 122465
16.4%
t 110322
14.8%
l 93254
12.5%
d 82493
11.1%
A 81087
10.9%
m 55948
7.5%
e 49350
6.6%
a 29234
 
3.9%
r 28120
 
3.8%
I 27825
 
3.7%
Other values (17) 65532
8.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 745630
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 122465
16.4%
t 110322
14.8%
l 93254
12.5%
d 82493
11.1%
A 81087
10.9%
m 55948
7.5%
e 49350
6.6%
a 29234
 
3.9%
r 28120
 
3.8%
I 27825
 
3.7%
Other values (17) 65532
8.8%
Distinct 132
Distinct (%) < 0.1%
Missing 6
Missing (%) < 0.1%
Memory size 4.5 MiB

Length

Max length 76
Median length 11
Mean length 11.71096126
Min length 6

Characters and Unicode

Total characters 6846064
Distinct characters 31
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 39
Unique (%) < 0.1%

Sample

1st row Skin: Whole
2nd row Skin: Whole
3rd row Egg(s)
4th row Skeleton: Whole
5th row Skeleton: Whole
ValueCountFrequency (%)
whole 535339
45.8%
skin 470355
40.2%
skeleton 58626
 
5.0%
egg(s 33064
 
2.8%
fluid 32579
 
2.8%
partial 24616
 
2.1%
nest(s 4794
 
0.4%
feather(s 4784
 
0.4%
mounted 1952
 
0.2%
clutch 967
 
0.1%
Other values (7) 2530
 
0.2%

Most occurring characters

ValueCountFrequency (%)
e 671417
9.8%
l 654016
9.6%
o 595917
8.7%
585020
8.5%
: 562892
8.2%
h 541090
7.9%
W 535338
7.8%
n 532123
7.8%
i 529352
7.7%
S 529335
7.7%
Other values (21) 1109564
16.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4423169
64.6%
Uppercase Letter 1169248
 
17.1%
Space Separator 585020
 
8.5%
Other Punctuation 583343
 
8.5%
Open Punctuation 42642
 
0.6%
Close Punctuation 42642
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 671417
15.2%
l 654016
14.8%
o 595917
13.5%
h 541090
12.2%
n 532123
12.0%
i 529352
12.0%
k 528981
12.0%
t 96879
 
2.2%
g 66128
 
1.5%
a 55099
 
1.2%
Other values (8) 152167
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
W 535338
45.8%
S 529335
45.3%
F 37363
 
3.2%
E 33064
 
2.8%
P 24615
 
2.1%
N 4794
 
0.4%
M 3399
 
0.3%
C 1340
 
0.1%
Other Punctuation
ValueCountFrequency (%)
: 562892
96.5%
; 20451
 
3.5%
Space Separator
ValueCountFrequency (%)
585020
100.0%
Open Punctuation
ValueCountFrequency (%)
( 42642
100.0%
Close Punctuation
ValueCountFrequency (%)
) 42642
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5592417
81.7%
Common 1253647
 
18.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 671417
12.0%
l 654016
11.7%
o 595917
10.7%
h 541090
9.7%
W 535338
9.6%
n 532123
9.5%
i 529352
9.5%
S 529335
9.5%
k 528981
9.5%
t 96879
 
1.7%
Other values (16) 377969
6.8%
Common
ValueCountFrequency (%)
585020
46.7%
: 562892
44.9%
( 42642
 
3.4%
) 42642
 
3.4%
; 20451
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6846064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 671417
9.8%
l 654016
9.6%
o 595917
8.7%
585020
8.5%
: 562892
8.2%
h 541090
7.9%
W 535338
7.8%
n 532123
7.8%
i 529352
7.7%
S 529335
7.7%
Other values (21) 1109564
16.2%

associatedMedia
Text

Missing 

Distinct 37503
Distinct (%) 6.7%
Missing 26026
Missing (%) 4.5%
Memory size 4.5 MiB

Length

Max length 1099
Median length 49
Mean length 49.90741291
Min length 48

Characters and Unicode

Total characters 27876584
Distinct characters 31
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10944
Unique (%) 2.0%

Sample

1st row https://collections.nmnh.si.edu/media/?i=15952021
2nd row https://collections.nmnh.si.edu/media/?i=15835939
3rd row https://collections.nmnh.si.edu/media/?i=15842844
4th row https://collections.nmnh.si.edu/media/?i=15825404
5th row https://collections.nmnh.si.edu/media/?i=15809714
ValueCountFrequency (%)
https://collections.nmnh.si.edu/media/?i=15806683 99
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15817157 98
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15840951 63
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15840941 63
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15840942 63
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=7010826 59
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15840950 54
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15811485 49
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15824469 48
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=15840962 46
 
< 0.1%
Other values (77185) 608830
99.9%

Most occurring characters

ValueCountFrequency (%)
i 2234264
 
8.0%
/ 2234264
 
8.0%
t 1675698
 
6.0%
s 1675698
 
6.0%
. 1675698
 
6.0%
n 1675698
 
6.0%
e 1675698
 
6.0%
h 1117132
 
4.0%
d 1117132
 
4.0%
m 1117132
 
4.0%
Other values (21) 11678170
41.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17315546
62.1%
Other Punctuation 5078000
 
18.2%
Decimal Number 4873566
 
17.5%
Math Symbol 558566
 
2.0%
Space Separator 50906
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2234264
12.9%
t 1675698
9.7%
s 1675698
9.7%
n 1675698
9.7%
e 1675698
9.7%
h 1117132
 
6.5%
d 1117132
 
6.5%
m 1117132
 
6.5%
l 1117132
 
6.5%
o 1117132
 
6.5%
Other values (4) 2792830
16.1%
Decimal Number
ValueCountFrequency (%)
1 1052460
21.6%
5 830305
17.0%
8 783646
16.1%
2 428910
8.8%
3 368611
 
7.6%
4 321436
 
6.6%
0 319563
 
6.6%
6 266822
 
5.5%
9 252012
 
5.2%
7 249801
 
5.1%
Other Punctuation
ValueCountFrequency (%)
/ 2234264
44.0%
. 1675698
33.0%
? 558566
 
11.0%
: 558566
 
11.0%
; 50906
 
1.0%
Math Symbol
ValueCountFrequency (%)
= 558566
100.0%
Space Separator
ValueCountFrequency (%)
50906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17315546
62.1%
Common 10561038
37.9%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2234264
21.2%
. 1675698
15.9%
1 1052460
10.0%
5 830305
 
7.9%
8 783646
 
7.4%
= 558566
 
5.3%
? 558566
 
5.3%
: 558566
 
5.3%
2 428910
 
4.1%
3 368611
 
3.5%
Other values (7) 1511446
14.3%
Latin
ValueCountFrequency (%)
i 2234264
12.9%
t 1675698
9.7%
s 1675698
9.7%
n 1675698
9.7%
e 1675698
9.7%
h 1117132
 
6.5%
d 1117132
 
6.5%
m 1117132
 
6.5%
l 1117132
 
6.5%
o 1117132
 
6.5%
Other values (4) 2792830
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27876584
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2234264
 
8.0%
/ 2234264
 
8.0%
t 1675698
 
6.0%
s 1675698
 
6.0%
. 1675698
 
6.0%
n 1675698
 
6.0%
e 1675698
 
6.0%
h 1117132
 
4.0%
d 1117132
 
4.0%
m 1117132
 
4.0%
Other values (21) 11678170
41.9%

associatedSequences
Text

Missing 

Distinct 4430
Distinct (%) 98.7%
Missing 580105
Missing (%) 99.2%
Memory size 4.5 MiB

Length

Max length 12558
Median length 49
Mean length 129.0780031
Min length 49

Characters and Unicode

Total characters 579173
Distinct characters 63
Distinct categories 7
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4421
Unique (%) 98.5%

Sample

1st row https://www.ncbi.nlm.nih.gov/gquery?term=KM080095
2nd row https://www.ncbi.nlm.nih.gov/gquery?term=JQ176229
3rd row https://www.ncbi.nlm.nih.gov/gquery?term=JQ173910
4th row https://www.ncbi.nlm.nih.gov/gquery?term=KU722483
5th row https://www.ncbi.nlm.nih.gov/gquery?term=FJ547617|https://www.ncbi.nlm.nih.gov/gquery?term=FJ547732|https://www.ncbi.nlm.nih.gov/gquery?term=FJ547781|https://www.ncbi.nlm.nih.gov/gquery?term=FJ547782
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=prjna521985 34
 
0.8%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273835 10
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273864 8
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ay273832 3
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj207364 3
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj207374 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mh778417 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=dq433197 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj207379 2
 
< 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mt456681 1
 
< 0.1%
Other values (4420) 4420
98.5%

Most occurring characters

ValueCountFrequency (%)
. 46361
 
8.0%
/ 34770
 
6.0%
w 34770
 
6.0%
n 34770
 
6.0%
t 34770
 
6.0%
h 23180
 
4.0%
r 23180
 
4.0%
e 23180
 
4.0%
i 23180
 
4.0%
m 23180
 
4.0%
Other values (53) 277832
48.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 359290
62.0%
Other Punctuation 104311
 
18.0%
Decimal Number 71114
 
12.3%
Uppercase Letter 25344
 
4.4%
Math Symbol 18693
 
3.2%
Dash Punctuation 420
 
0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 4122
16.3%
J 3684
14.5%
Q 3175
12.5%
U 2477
9.8%
E 1468
 
5.8%
R 1383
 
5.5%
M 1361
 
5.4%
F 1128
 
4.5%
N 849
 
3.3%
S 753
 
3.0%
Other values (16) 4944
19.5%
Lowercase Letter
ValueCountFrequency (%)
w 34770
 
9.7%
n 34770
 
9.7%
t 34770
 
9.7%
h 23180
 
6.5%
r 23180
 
6.5%
e 23180
 
6.5%
i 23180
 
6.5%
m 23180
 
6.5%
g 23180
 
6.5%
q 11590
 
3.2%
Other values (9) 104310
29.0%
Decimal Number
ValueCountFrequency (%)
7 9784
13.8%
1 8422
11.8%
2 7230
10.2%
5 7006
9.9%
4 6944
9.8%
9 6757
9.5%
0 6595
9.3%
3 6298
8.9%
8 6093
8.6%
6 5985
8.4%
Other Punctuation
ValueCountFrequency (%)
. 46361
44.4%
/ 34770
33.3%
? 11590
 
11.1%
: 11590
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 11590
62.0%
| 7103
38.0%
Dash Punctuation
ValueCountFrequency (%)
- 420
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 384634
66.4%
Common 194539
33.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 34770
 
9.0%
n 34770
 
9.0%
t 34770
 
9.0%
h 23180
 
6.0%
r 23180
 
6.0%
e 23180
 
6.0%
i 23180
 
6.0%
m 23180
 
6.0%
g 23180
 
6.0%
q 11590
 
3.0%
Other values (35) 129654
33.7%
Common
ValueCountFrequency (%)
. 46361
23.8%
/ 34770
17.9%
= 11590
 
6.0%
? 11590
 
6.0%
: 11590
 
6.0%
7 9784
 
5.0%
1 8422
 
4.3%
2 7230
 
3.7%
| 7103
 
3.7%
5 7006
 
3.6%
Other values (8) 39093
20.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 579173
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 46361
 
8.0%
/ 34770
 
6.0%
w 34770
 
6.0%
n 34770
 
6.0%
t 34770
 
6.0%
h 23180
 
4.0%
r 23180
 
4.0%
e 23180
 
4.0%
i 23180
 
4.0%
m 23180
 
4.0%
Other values (53) 277832
48.0%
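
The associatedSequences samples show several query URLs joined by "|" in a single cell, each carrying an identifier in its `term` parameter. A minimal sketch of unpacking such a cell, assuming the pipe is always the delimiter and `term` always holds the identifier; the helper name `accession_ids` is hypothetical:

```python
from urllib.parse import urlparse, parse_qs

def accession_ids(cell):
    """Split a pipe-delimited cell into URLs and return each 'term' query value."""
    ids = []
    for url in cell.split("|"):
        query = parse_qs(urlparse(url.strip()).query)
        ids.extend(query.get("term", []))
    return ids

# Two of the URLs from the 5th sample row, joined as they appear in the data.
cell = ("https://www.ncbi.nlm.nih.gov/gquery?term=FJ547617|"
        "https://www.ncbi.nlm.nih.gov/gquery?term=FJ547732")
print(accession_ids(cell))
```

Splitting first and parsing each URL separately keeps a malformed entry from affecting its neighbours in the same cell.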

occurrenceRemarks
Text

Missing 

Distinct 7341
Distinct (%) 60.3%
Missing 572414
Missing (%) 97.9%
Memory size 4.5 MiB

Length

Max length 6354
Median length 555
Mean length 50.68484152
Min length 1

Characters and Unicode

Total characters 617240
Distinct characters 102
Distinct categories 15
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6370
Unique (%) 52.3%

Sample

1st row carcass saved
2nd row fertile
3rd row A second soft part color is listed, but it is in French. It needs translated; the handwriting is somewhat smushed and hard to read. Appears to be "Patte et tour des yeux carminis." [Feet and eye ring carmine?]
4th row breeding
5th row W.P. Taylor
ValueCountFrequency (%)
of 4593
 
4.4%
in 2349
 
2.2%
as 2209
 
2.1%
the 2118
 
2.0%
usnm 2055
 
2.0%
tag 1748
 
1.7%
specimens 1534
 
1.5%
cataloged 1516
 
1.4%
1422
 
1.4%
originally 1393
 
1.3%
Other values (10725) 84151
80.1%

Most occurring characters

ValueCountFrequency (%)
92912
15.1%
e 51486
 
8.3%
a 37742
 
6.1%
n 34944
 
5.7%
o 33997
 
5.5%
i 32379
 
5.2%
t 32167
 
5.2%
s 26495
 
4.3%
r 25801
 
4.2%
l 22827
 
3.7%
Other values (92) 226490
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 429839
69.6%
Space Separator 92912
 
15.1%
Uppercase Letter 38725
 
6.3%
Decimal Number 34189
 
5.5%
Other Punctuation 18120
 
2.9%
Dash Punctuation 1707
 
0.3%
Open Punctuation 745
 
0.1%
Close Punctuation 743
 
0.1%
Math Symbol 219
 
< 0.1%
Final Punctuation 12
 
< 0.1%
Other values (5) 29
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 51486
12.0%
a 37742
 
8.8%
n 34944
 
8.1%
o 33997
 
7.9%
i 32379
 
7.5%
t 32167
 
7.5%
s 26495
 
6.2%
r 25801
 
6.0%
l 22827
 
5.3%
d 18987
 
4.4%
Other values (20) 113014
26.3%
Uppercase Letter
ValueCountFrequency (%)
S 4637
12.0%
N 4058
 
10.5%
M 3832
 
9.9%
U 3571
 
9.2%
C 3039
 
7.8%
O 2136
 
5.5%
A 1965
 
5.1%
T 1912
 
4.9%
B 1512
 
3.9%
F 1474
 
3.8%
Other values (16) 10589
27.3%
Other Punctuation
ValueCountFrequency (%)
. 7077
39.1%
, 3513
19.4%
: 1896
 
10.5%
; 1812
 
10.0%
" 1381
 
7.6%
# 887
 
4.9%
/ 499
 
2.8%
' 346
 
1.9%
& 321
 
1.8%
? 158
 
0.9%
Other values (5) 230
 
1.3%
Decimal Number
ValueCountFrequency (%)
1 5919
17.3%
2 4779
14.0%
0 3893
11.4%
5 3430
10.0%
3 3030
8.9%
6 3028
8.9%
4 2960
8.7%
9 2714
7.9%
8 2382
7.0%
7 2054
 
6.0%
Math Symbol
ValueCountFrequency (%)
+ 92
42.0%
= 68
31.1%
> 34
 
15.5%
< 14
 
6.4%
± 9
 
4.1%
~ 2
 
0.9%
Dash Punctuation
ValueCountFrequency (%)
- 1699
99.5%
6
 
0.4%
1
 
0.1%
1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 633
85.0%
[ 112
 
15.0%
Close Punctuation
ValueCountFrequency (%)
) 632
85.1%
] 111
 
14.9%
Space Separator
ValueCountFrequency (%)
92912
100.0%
Final Punctuation
ValueCountFrequency (%)
12
100.0%
Initial Punctuation
ValueCountFrequency (%)
12
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 11
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Other Symbol
ValueCountFrequency (%)
° 2
100.0%
Other Letter
ValueCountFrequency (%)
º 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 468565
75.9%
Common 148675
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 51486
 
11.0%
a 37742
 
8.1%
n 34944
 
7.5%
o 33997
 
7.3%
i 32379
 
6.9%
t 32167
 
6.9%
s 26495
 
5.7%
r 25801
 
5.5%
l 22827
 
4.9%
d 18987
 
4.1%
Other values (47) 151740
32.4%
Common
ValueCountFrequency (%)
92912
62.5%
. 7077
 
4.8%
1 5919
 
4.0%
2 4779
 
3.2%
0 3893
 
2.6%
, 3513
 
2.4%
5 3430
 
2.3%
3 3030
 
2.0%
6 3028
 
2.0%
4 2960
 
2.0%
Other values (35) 18134
 
12.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 617187
> 99.9%
Punctuation 32
 
< 0.1%
None 21
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
92912
15.1%
e 51486
 
8.3%
a 37742
 
6.1%
n 34944
 
5.7%
o 33997
 
5.5%
i 32379
 
5.2%
t 32167
 
5.2%
s 26495
 
4.3%
r 25801
 
4.2%
l 22827
 
3.7%
Other values (80) 226437
36.7%
Punctuation
ValueCountFrequency (%)
12
37.5%
12
37.5%
6
18.8%
1
 
3.1%
1
 
3.1%
None
ValueCountFrequency (%)
± 9
42.9%
é 3
 
14.3%
ó 2
 
9.5%
ñ 2
 
9.5%
ç 2
 
9.5%
° 2
 
9.5%
º 1
 
4.8%

eventDate
Text

Missing 

Distinct 51251
Distinct (%) 9.4%
Missing 41253
Missing (%) 7.1%
Memory size 4.5 MiB

Length

Max length 21
Median length 10
Mean length 9.765286497
Min length 4

Characters and Unicode

Total characters 5305861
Distinct characters 15
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8010
Unique (%) 1.5%

Sample

1st row 1859-05
2nd row 1883-03-18
3rd row 1895-05-26
4th row 1924-08-06
5th row 1987-04-09
ValueCountFrequency (%)
1865 620
 
0.1%
1877 533
 
0.1%
1966 478
 
0.1%
1926 419
 
0.1%
1939-07 366
 
0.1%
1937 360
 
0.1%
1936 280
 
0.1%
1884 276
 
0.1%
1888 253
 
< 0.1%
1881 251
 
< 0.1%
Other values (51241) 539505
99.3%

Most occurring characters

ValueCountFrequency (%)
- 1042804
19.7%
1 1032724
19.5%
0 808669
15.2%
9 611987
11.5%
2 401316
 
7.6%
8 309090
 
5.8%
6 249482
 
4.7%
3 225464
 
4.2%
5 223598
 
4.2%
4 216175
 
4.1%
Other values (5) 184552
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4262236
80.3%
Dash Punctuation 1042804
 
19.7%
Other Punctuation 817
 
< 0.1%
Space Separator 2
 
< 0.1%
Lowercase Letter 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1032724
24.2%
0 808669
19.0%
9 611987
14.4%
2 401316
 
9.4%
8 309090
 
7.3%
6 249482
 
5.9%
3 225464
 
5.3%
5 223598
 
5.2%
4 216175
 
5.1%
7 183731
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
o 1
50.0%
r 1
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 1042804
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 817
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5305859
> 99.9%
Latin 2
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
- 1042804
19.7%
1 1032724
19.5%
0 808669
15.2%
9 611987
11.5%
2 401316
 
7.6%
8 309090
 
5.8%
6 249482
 
4.7%
3 225464
 
4.2%
5 223598
 
4.2%
4 216175
 
4.1%
Other values (3) 184550
 
3.5%
Latin
ValueCountFrequency (%)
o 1
50.0%
r 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5305861
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1042804
19.7%
1 1032724
19.5%
0 808669
15.2%
9 611987
11.5%
2 401316
 
7.6%
8 309090
 
5.8%
6 249482
 
4.7%
3 225464
 
4.2%
5 223598
 
4.2%
4 216175
 
4.1%
Other values (5) 184552
 
3.5%
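
The eventDate values come at several precisions: year only (1865), year and month (1859-05), and full dates (1883-03-18). The "/" count and the maximum length of 21 are consistent with some date intervals, and the two lowercase letters indicate at least one free-text value. A minimal sketch of classifying a value by precision, assuming ISO 8601-style partial dates; the helper name `date_precision` and the interval and free-text inputs in the usage lines are hypothetical:

```python
import re

# Partial ISO 8601 date: YYYY, YYYY-MM, or YYYY-MM-DD.
_PART = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")

def date_precision(value):
    """Classify an eventDate string as 'year', 'month', 'day', 'interval' or 'other'."""
    if "/" in value:
        start, _, end = value.partition("/")
        both_valid = _PART.match(start) and _PART.match(end)
        return "interval" if both_valid else "other"
    m = _PART.match(value)
    if not m:
        return "other"
    if m.group(3):
        return "day"
    return "month" if m.group(2) else "year"

print(date_precision("1859-05"))                 # sample row from the report
print(date_precision("1883-03-18/1883-03-20"))   # illustrative interval
```

The pattern checks shape only; it does not reject impossible dates such as month 13.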

startDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 55224
Missing (%) 9.4%
Memory size 4.5 MiB

Length

Max length 3
Median length 3
Mean length 2.722368938
Min length 1

Characters and Unicode

Total characters 1441135
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 151
2nd row 77
3rd row 146
4th row 219
5th row 99
ValueCountFrequency (%)
151 3518
 
0.7%
181 3362
 
0.6%
120 2982
 
0.6%
212 2837
 
0.5%
152 2631
 
0.5%
140 2506
 
0.5%
141 2480
 
0.5%
90 2458
 
0.5%
134 2428
 
0.5%
135 2416
 
0.5%
Other values (356) 501750
94.8%

Most occurring characters

ValueCountFrequency (%)
1 306663
21.3%
2 236590
16.4%
3 179551
12.5%
5 113269
 
7.9%
4 112416
 
7.8%
6 107173
 
7.4%
7 97806
 
6.8%
0 96808
 
6.7%
8 95828
 
6.6%
9 95031
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1441135
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 306663
21.3%
2 236590
16.4%
3 179551
12.5%
5 113269
 
7.9%
4 112416
 
7.8%
6 107173
 
7.4%
7 97806
 
6.8%
0 96808
 
6.7%
8 95828
 
6.6%
9 95031
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1441135
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 306663
21.3%
2 236590
16.4%
3 179551
12.5%
5 113269
 
7.9%
4 112416
 
7.8%
6 107173
 
7.4%
7 97806
 
6.8%
0 96808
 
6.7%
8 95828
 
6.6%
9 95031
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1441135
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 306663
21.3%
2 236590
16.4%
3 179551
12.5%
5 113269
 
7.9%
4 112416
 
7.8%
6 107173
 
7.4%
7 97806
 
6.8%
0 96808
 
6.7%
8 95828
 
6.6%
9 95031
 
6.6%
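
The startDayOfYear sample rows (77, 146, 219, 99) line up with the full dates in the eventDate sample rows (1883-03-18, 1895-05-26, 1924-08-06, 1987-04-09). A minimal sketch of deriving the ordinal day from a full date, assuming the proleptic Gregorian calendar used by Python's `datetime`; the helper name `day_of_year` is hypothetical, and partial dates such as 1859-05 are out of scope here:

```python
from datetime import date

def day_of_year(iso_date):
    """Return the 1-based ordinal day within the year for a full YYYY-MM-DD date."""
    return date.fromisoformat(iso_date).timetuple().tm_yday

# Full dates taken from the eventDate sample rows.
for d in ["1883-03-18", "1895-05-26", "1924-08-06", "1987-04-09"]:
    print(d, day_of_year(d))
```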

endDayOfYear
Text

Missing 

Distinct 366
Distinct (%) 0.1%
Missing 55046
Missing (%) 9.4%
Memory size 4.5 MiB

Length

Max length 3
Median length 3
Mean length 2.72220166
Min length 1

Characters and Unicode

Total characters 1441531
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 151
2nd row 77
3rd row 146
4th row 219
5th row 99
ValueCountFrequency (%)
151 3514
 
0.7%
181 3366
 
0.6%
120 2999
 
0.6%
212 2840
 
0.5%
152 2637
 
0.5%
59 2554
 
0.5%
140 2508
 
0.5%
141 2480
 
0.5%
90 2455
 
0.5%
134 2427
 
0.5%
Other values (356) 501766
94.8%

Most occurring characters

ValueCountFrequency (%)
1 306696
21.3%
2 236599
16.4%
3 179569
12.5%
5 113353
 
7.9%
4 112451
 
7.8%
6 107235
 
7.4%
7 97784
 
6.8%
0 96826
 
6.7%
8 95830
 
6.6%
9 95188
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1441531
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 306696
21.3%
2 236599
16.4%
3 179569
12.5%
5 113353
 
7.9%
4 112451
 
7.8%
6 107235
 
7.4%
7 97784
 
6.8%
0 96826
 
6.7%
8 95830
 
6.6%
9 95188
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Common 1441531
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 306696
21.3%
2 236599
16.4%
3 179569
12.5%
5 113353
 
7.9%
4 112451
 
7.8%
6 107235
 
7.4%
7 97784
 
6.8%
0 96826
 
6.7%
8 95830
 
6.6%
9 95188
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1441531
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 306696
21.3%
2 236599
16.4%
3 179569
12.5%
5 113353
 
7.9%
4 112451
 
7.8%
6 107235
 
7.4%
7 97784
 
6.8%
0 96826
 
6.7%
8 95830
 
6.6%
9 95188
 
6.6%

year
Text

Missing 

Distinct 204
Distinct (%) < 0.1%
Missing 41253
Missing (%) 7.1%
Memory size 4.5 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 2173356
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4
Unique (%) < 0.1%

Sample

1st row 1859
2nd row 1883
3rd row 1895
4th row 1924
5th row 1987
ValueCountFrequency (%)
1965 14461
 
2.7%
1964 13004
 
2.4%
1966 10904
 
2.0%
1912 9421
 
1.7%
1911 8199
 
1.5%
1949 8030
 
1.5%
1923 7875
 
1.4%
1950 6975
 
1.3%
1967 6970
 
1.3%
1892 6943
 
1.3%
Other values (194) 450557
82.9%

Most occurring characters

ValueCountFrequency (%)
1 642158
29.5%
9 524962
24.2%
8 217844
 
10.0%
6 138244
 
6.4%
0 135668
 
6.2%
2 113158
 
5.2%
4 110558
 
5.1%
5 102854
 
4.7%
3 100877
 
4.6%
7 87033
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2173356
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 642158
29.5%
9 524962
24.2%
8 217844
 
10.0%
6 138244
 
6.4%
0 135668
 
6.2%
2 113158
 
5.2%
4 110558
 
5.1%
5 102854
 
4.7%
3 100877
 
4.6%
7 87033
 
4.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2173356
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 642158
29.5%
9 524962
24.2%
8 217844
 
10.0%
6 138244
 
6.4%
0 135668
 
6.2%
2 113158
 
5.2%
4 110558
 
5.1%
5 102854
 
4.7%
3 100877
 
4.6%
7 87033
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2173356
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 642158
29.5%
9 524962
24.2%
8 217844
 
10.0%
6 138244
 
6.4%
0 135668
 
6.2%
2 113158
 
5.2%
4 110558
 
5.1%
5 102854
 
4.7%
3 100877
 
4.6%
7 87033
 
4.0%

month
Text

Missing 

Distinct 12
Distinct (%) < 0.1%
Missing 53401
Missing (%) 9.1%
Memory size 4.5 MiB

Length

Max length 2
Median length 1
Mean length 1.178931872
Min length 1

Characters and Unicode

Total characters 626238
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 5
2nd row 3
3rd row 5
4th row 8
5th row 4
ValueCountFrequency (%)
5 70381
13.2%
6 61192
11.5%
4 54193
10.2%
3 50558
9.5%
7 46977
8.8%
2 40493
7.6%
8 39923
7.5%
9 37769
7.1%
10 35514
6.7%
1 34658
6.5%
Other values (2) 59533
11.2%

Most occurring characters

ValueCountFrequency (%)
1 160487
25.6%
5 70381
11.2%
2 69244
11.1%
6 61192
 
9.8%
4 54193
 
8.7%
3 50558
 
8.1%
7 46977
 
7.5%
8 39923
 
6.4%
9 37769
 
6.0%
0 35514
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 626238
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 160487
25.6%
5 70381
11.2%
2 69244
11.1%
6 61192
 
9.8%
4 54193
 
8.7%
3 50558
 
8.1%
7 46977
 
7.5%
8 39923
 
6.4%
9 37769
 
6.0%
0 35514
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Common 626238
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 160487
25.6%
5 70381
11.2%
2 69244
11.1%
6 61192
 
9.8%
4 54193
 
8.7%
3 50558
 
8.1%
7 46977
 
7.5%
8 39923
 
6.4%
9 37769
 
6.0%
0 35514
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 626238
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 160487
25.6%
5 70381
11.2%
2 69244
11.1%
6 61192
 
9.8%
4 54193
 
8.7%
3 50558
 
8.1%
7 46977
 
7.5%
8 39923
 
6.4%
9 37769
 
6.0%
0 35514
 
5.7%

day
Text

Missing 

Distinct 31
Distinct (%) < 0.1%
Missing 74067
Missing (%) 12.7%
Memory size 4.5 MiB

Length

Max length 2
Median length 2
Mean length 1.70740904
Min length 1

Characters and Unicode

Total characters 871675
Distinct characters 10
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row 18
2nd row 26
3rd row 6
4th row 9
5th row 1
ValueCountFrequency (%)
20 17983
 
3.5%
10 17958
 
3.5%
8 17685
 
3.5%
15 17685
 
3.5%
12 17464
 
3.4%
21 17462
 
3.4%
24 17317
 
3.4%
4 17158
 
3.4%
22 17154
 
3.4%
16 17136
 
3.4%
Other values (21) 335523
65.7%
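
The summary figures reported for day are internally consistent under two assumptions that the report does not state explicitly: Missing (%) is taken relative to all 584592 observations, while Distinct (%) and Mean length are taken relative to the non-missing values only. The sketch below is a minimal illustration of that reading, not the profiler's own implementation; the function name `summarize` and the use of `None` for a missing cell are hypothetical choices.

```python
from statistics import mean, median


def summarize(values):
    """Summarise a text column; None marks a missing cell (an assumption)."""
    present = [v for v in values if v is not None]
    lengths = [len(v) for v in present]
    n_missing = len(values) - len(present)
    return {
        "distinct": len(set(present)),
        # Assumed convention: Distinct (%) is relative to non-missing values.
        "distinct_pct": 100 * len(set(present)) / len(present),
        "missing": n_missing,
        # Assumed convention: Missing (%) is relative to all observations.
        "missing_pct": 100 * n_missing / len(values),
        "max_length": max(lengths),
        "median_length": median(lengths),
        "mean_length": mean(lengths),
        "min_length": min(lengths),
        "total_characters": sum(lengths),
    }


# Toy input: the five sample rows of `day` plus one hypothetical missing cell.
stats = summarize(["18", "26", "6", "9", "1", None])

# Cross-check of the totals reported for `day`.
observations, missing, total_chars = 584592, 74067, 871675
report_missing_pct = round(100 * missing / observations, 1)
report_mean_length = total_chars / (observations - missing)
```

Under these assumptions `report_missing_pct` and `report_mean_length` reproduce the reported Missing (%) and Mean length for day.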

Most occurring characters

ValueCountFrequency (%)
1 228783
26.2%
2 218106
25.0%
3 73770
 
8.5%
4 51243
 
5.9%
8 51026
 
5.9%
0 50843
 
5.8%
5 50220
 
5.8%
6 49847
 
5.7%
7 49503
 
5.7%
9 48334
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 871675
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 228783
26.2%
2 218106
25.0%
3 73770
 
8.5%
4 51243
 
5.9%
8 51026
 
5.9%
0 50843
 
5.8%
5 50220
 
5.8%
6 49847
 
5.7%
7 49503
 
5.7%
9 48334
 
5.5%

Most occurring scripts

ValueCountFrequency (%)
Common 871675
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 228783
26.2%
2 218106
25.0%
3 73770
 
8.5%
4 51243
 
5.9%
8 51026
 
5.9%
0 50843
 
5.8%
5 50220
 
5.8%
6 49847
 
5.7%
7 49503
 
5.7%
9 48334
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 871675
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 228783
26.2%
2 218106
25.0%
3 73770
 
8.5%
4 51243
 
5.9%
8 51026
 
5.9%
0 50843
 
5.8%
5 50220
 
5.8%
6 49847
 
5.7%
7 49503
 
5.7%
9 48334
 
5.5%

verbatimEventDate
Text

Missing 

Distinct 43994
Distinct (%) 12.6%
Missing 235442
Missing (%) 40.3%
Memory size 4.5 MiB

Length

Max length 60
Median length 11
Mean length 10.64060719
Min length 1

Characters and Unicode

Total characters 3715168
Distinct characters 77
Distinct categories 11
Distinct scripts 2
Distinct blocks 2

Unique

Unique 10311
Unique (%) 3.0%

Sample

1st row -- May 1859
2nd row 18 Mar 1883
3rd row 26 May 1895
4th row 6 Aug 1924
5th row 9 Apr 1987
ValueCountFrequency (%)
149965
 
14.3%
may 43235
 
4.1%
jun 37603
 
3.6%
apr 31571
 
3.0%
mar 27292
 
2.6%
jul 27206
 
2.6%
aug 23700
 
2.3%
feb 21866
 
2.1%
sep 21167
 
2.0%
jan 18181
 
1.7%
Other values (727) 644585
61.6%

Most occurring characters

ValueCountFrequency (%)
697221
18.8%
1 503447
13.6%
- 381992
 
10.3%
9 327404
 
8.8%
2 174483
 
4.7%
8 174195
 
4.7%
6 106883
 
2.9%
3 99628
 
2.7%
4 93965
 
2.5%
a 89421
 
2.4%
Other values (67) 1066529
28.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1729072
46.5%
Space Separator 697221
18.8%
Lowercase Letter 605710
 
16.3%
Dash Punctuation 381992
 
10.3%
Uppercase Letter 300576
 
8.1%
Other Punctuation 550
 
< 0.1%
Close Punctuation 16
 
< 0.1%
Open Punctuation 16
 
< 0.1%
Math Symbol 8
 
< 0.1%
Format 4
 
< 0.1%
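
The category names in this table correspond to Unicode general categories (for example `Nd` for Decimal Number and `Zs` for Space Separator). A comparable tally can be sketched with Python's standard `unicodedata` module. The helper name `category_counts` and the name mapping are illustrative assumptions, and the input is limited to the five sample rows shown for this variable, so the counts will not match the full-column figures.

```python
import unicodedata
from collections import Counter

# Display names for the general-category codes seen in this column;
# the mapping is illustrative and not exhaustive.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Cf": "Format",
    "Pc": "Connector Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the raw two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The five sample rows listed for verbatimEventDate.
samples = ["-- May 1859", "18 Mar 1883", "26 May 1895", "6 Aug 1924", "9 Apr 1987"]
counts = category_counts(samples)
for name, n in counts.most_common():
    print(name, n)
```

Even on five rows the ordering resembles the table: digits dominate, followed by spaces and letters, with dashes contributed by partial dates such as `-- May 1859`.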

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 89421
14.8%
u 89025
14.7%
r 60311
10.0%
e 56954
9.4%
n 56861
9.4%
p 53345
8.8%
y 43351
7.2%
c 31070
 
5.1%
l 28240
 
4.7%
g 24158
 
4.0%
Other values (14) 72974
12.0%
Uppercase Letter
ValueCountFrequency (%)
J 83206
27.7%
M 70639
23.5%
A 55418
18.4%
F 22373
 
7.4%
S 21945
 
7.3%
O 18184
 
6.0%
N 15173
 
5.0%
D 12829
 
4.3%
W 412
 
0.1%
I 175
 
0.1%
Other values (14) 222
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 503447
29.1%
9 327404
18.9%
2 174483
 
10.1%
8 174195
 
10.1%
6 106883
 
6.2%
3 99628
 
5.8%
4 93965
 
5.4%
0 87653
 
5.1%
5 82545
 
4.8%
7 78869
 
4.6%
Other Punctuation
ValueCountFrequency (%)
/ 176
32.0%
. 144
26.2%
, 89
16.2%
? 49
 
8.9%
' 34
 
6.2%
: 32
 
5.8%
& 12
 
2.2%
\ 11
 
2.0%
" 2
 
0.4%
# 1
 
0.2%
Close Punctuation
ValueCountFrequency (%)
] 8
50.0%
) 8
50.0%
Open Punctuation
ValueCountFrequency (%)
( 8
50.0%
[ 8
50.0%
Space Separator
ValueCountFrequency (%)
697221
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 381992
100.0%
Math Symbol
ValueCountFrequency (%)
= 8
100.0%
Format
ValueCountFrequency (%)
4
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2808882
75.6%
Latin 906286
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 89421
 
9.9%
u 89025
 
9.8%
J 83206
 
9.2%
M 70639
 
7.8%
r 60311
 
6.7%
e 56954
 
6.3%
n 56861
 
6.3%
A 55418
 
6.1%
p 53345
 
5.9%
y 43351
 
4.8%
Other values (38) 247755
27.3%
Common
ValueCountFrequency (%)
697221
24.8%
1 503447
17.9%
- 381992
13.6%
9 327404
11.7%
2 174483
 
6.2%
8 174195
 
6.2%
6 106883
 
3.8%
3 99628
 
3.5%
4 93965
 
3.3%
0 87653
 
3.1%
Other values (19) 162011
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3715164
> 99.9%
Punctuation 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
697221
18.8%
1 503447
13.6%
- 381992
 
10.3%
9 327404
 
8.8%
2 174483
 
4.7%
8 174195
 
4.7%
6 106883
 
2.9%
3 99628
 
2.7%
4 93965
 
2.5%
a 89421
 
2.4%
Other values (66) 1066525
28.7%
Punctuation
ValueCountFrequency (%)
4
100.0%

habitat
Text

Missing 

Distinct 4924
Distinct (%) 28.6%
Missing 567355
Missing (%) 97.1%
Memory size 4.5 MiB

Length

Max length 191
Median length 141
Mean length 27.13418808
Min length 3

Characters and Unicode

Total characters 467712
Distinct characters 82
Distinct categories 9
Distinct scripts 2
Distinct blocks 1

Unique

Unique 3478
Unique (%) 20.2%

Sample

1st row IN OPEN OCEAN AT 0835
2nd row dense marshy grass
3rd row Along lake shore, water and dead brush
4th row airport
5th row montane forest edge
ValueCountFrequency (%)
forest 6854
 
9.3%
with 2343
 
3.2%
open 1915
 
2.6%
of 1628
 
2.2%
in 1549
 
2.1%
and 1461
 
2.0%
scrub 1279
 
1.7%
edge 1213
 
1.6%
945
 
1.3%
on 919
 
1.2%
Other values (2526) 53491
72.7%
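
The table above appears to count lowercased word tokens rather than whole cell values (for example `forest`, `with`, `open`). The sketch below shows one way such a tally could be produced, assuming plain whitespace splitting; the profiler's actual tokenisation is not documented in this report and may strip punctuation differently. The helper name `token_counts` is hypothetical.

```python
from collections import Counter


def token_counts(values):
    """Count lowercased, whitespace-separated tokens; None cells are skipped."""
    counts = Counter()
    for value in values:
        if value is None:
            continue
        counts.update(value.lower().split())
    return counts


# The five sample rows listed for habitat.
samples = [
    "IN OPEN OCEAN AT 0835",
    "dense marshy grass",
    "Along lake shore, water and dead brush",
    "airport",
    "montane forest edge",
]
counts = token_counts(samples)
```

With naive splitting, punctuation stays attached to its token (`shore,` is counted separately from `shore`), so a production tally would likely normalise punctuation first.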

Most occurring characters

ValueCountFrequency (%)
56360
 
12.1%
e 41846
 
8.9%
o 33590
 
7.2%
a 33285
 
7.1%
s 31715
 
6.8%
r 31696
 
6.8%
t 25427
 
5.4%
n 24905
 
5.3%
i 21550
 
4.6%
l 17791
 
3.8%
Other values (72) 149547
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 376043
80.4%
Space Separator 56360
 
12.1%
Uppercase Letter 25209
 
5.4%
Other Punctuation 6419
 
1.4%
Dash Punctuation 1495
 
0.3%
Decimal Number 1436
 
0.3%
Open Punctuation 365
 
0.1%
Close Punctuation 365
 
0.1%
Math Symbol 20
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 41846
11.1%
o 33590
 
8.9%
a 33285
 
8.9%
s 31715
 
8.4%
r 31696
 
8.4%
t 25427
 
6.8%
n 24905
 
6.6%
i 21550
 
5.7%
l 17791
 
4.7%
d 17630
 
4.7%
Other values (16) 96608
25.7%
Uppercase Letter
ValueCountFrequency (%)
O 2706
 
10.7%
E 2400
 
9.5%
R 2179
 
8.6%
A 2016
 
8.0%
N 1705
 
6.8%
S 1663
 
6.6%
L 1499
 
5.9%
I 1493
 
5.9%
T 1420
 
5.6%
C 1283
 
5.1%
Other values (16) 6845
27.2%
Other Punctuation
ValueCountFrequency (%)
, 4306
67.1%
. 571
 
8.9%
; 545
 
8.5%
& 456
 
7.1%
/ 439
 
6.8%
" 28
 
0.4%
: 27
 
0.4%
' 25
 
0.4%
? 15
 
0.2%
# 4
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 592
41.2%
5 303
21.1%
1 155
 
10.8%
2 135
 
9.4%
3 126
 
8.8%
4 53
 
3.7%
6 27
 
1.9%
8 17
 
1.2%
7 17
 
1.2%
9 11
 
0.8%
Math Symbol
ValueCountFrequency (%)
+ 8
40.0%
< 7
35.0%
= 5
25.0%
Open Punctuation
ValueCountFrequency (%)
( 350
95.9%
[ 15
 
4.1%
Close Punctuation
ValueCountFrequency (%)
) 350
95.9%
] 15
 
4.1%
Space Separator
ValueCountFrequency (%)
56360
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1495
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 401252
85.8%
Common 66460
 
14.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 41846
 
10.4%
o 33590
 
8.4%
a 33285
 
8.3%
s 31715
 
7.9%
r 31696
 
7.9%
t 25427
 
6.3%
n 24905
 
6.2%
i 21550
 
5.4%
l 17791
 
4.4%
d 17630
 
4.4%
Other values (42) 121817
30.4%
Common
ValueCountFrequency (%)
56360
84.8%
, 4306
 
6.5%
- 1495
 
2.2%
0 592
 
0.9%
. 571
 
0.9%
; 545
 
0.8%
& 456
 
0.7%
/ 439
 
0.7%
( 350
 
0.5%
) 350
 
0.5%
Other values (20) 996
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 467712
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
56360
 
12.1%
e 41846
 
8.9%
o 33590
 
7.2%
a 33285
 
7.1%
s 31715
 
6.8%
r 31696
 
6.8%
t 25427
 
5.4%
n 24905
 
5.3%
i 21550
 
4.6%
l 17791
 
3.8%
Other values (72) 149547
32.0%
Distinct 6798
Distinct (%) 1.2%
Missing 2
Missing (%) < 0.1%
Memory size 4.5 MiB

Length

Max length 95
Median length 75
Mean length 36.76763373
Min length 4

Characters and Unicode

Total characters 21493991
Distinct characters 74
Distinct categories 9
Distinct scripts 2
Distinct blocks 3

Unique

Unique 1458
Unique (%) 0.2%

Sample

1st row South America, Paraguay, Asuncion
2nd row North America, United States, Florida
3rd row North America, United States, South Dakota
4th row North America, United States, Maine
5th row Asia, Philippines, Palawan, Palawan Province
ValueCountFrequency (%)
america 389870
 
13.5%
north 349097
 
12.1%
united 213165
 
7.4%
states 211488
 
7.4%
asia 94981
 
3.3%
south 88499
 
3.1%
africa 52986
 
1.8%
mexico 32547
 
1.1%
panama 31800
 
1.1%
colombia 28517
 
1.0%
Other values (4797) 1384325
48.1%

Most occurring characters

ValueCountFrequency (%)
2292685
 
10.7%
a 2269264
 
10.6%
i 1576409
 
7.3%
e 1449846
 
6.7%
t 1415514
 
6.6%
r 1302972
 
6.1%
, 1293349
 
6.0%
o 1083406
 
5.0%
n 1034429
 
4.8%
s 708939
 
3.3%
Other values (64) 7067178
32.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14996132
69.8%
Uppercase Letter 2873711
 
13.4%
Space Separator 2292685
 
10.7%
Other Punctuation 1312996
 
6.1%
Dash Punctuation 16199
 
0.1%
Open Punctuation 1132
 
< 0.1%
Close Punctuation 1131
 
< 0.1%
Decimal Number 3
 
< 0.1%
Math Symbol 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2269264
15.1%
i 1576409
10.5%
e 1449846
9.7%
t 1415514
9.4%
r 1302972
8.7%
o 1083406
 
7.2%
n 1034429
 
6.9%
s 708939
 
4.7%
c 702577
 
4.7%
h 651096
 
4.3%
Other values (19) 2801680
18.7%
Uppercase Letter
ValueCountFrequency (%)
A 655752
22.8%
N 422462
14.7%
S 371915
12.9%
U 235640
 
8.2%
C 213796
 
7.4%
M 131471
 
4.6%
P 129994
 
4.5%
I 77028
 
2.7%
B 69307
 
2.4%
T 68706
 
2.4%
Other values (16) 497640
17.3%
Other Punctuation
ValueCountFrequency (%)
, 1293349
98.5%
' 6575
 
0.5%
. 5181
 
0.4%
? 4098
 
0.3%
/ 3790
 
0.3%
& 2
 
< 0.1%
\ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 1
33.3%
8 1
33.3%
6 1
33.3%
Dash Punctuation
ValueCountFrequency (%)
- 16138
99.6%
61
 
0.4%
Open Punctuation
ValueCountFrequency (%)
( 1125
99.4%
[ 7
 
0.6%
Close Punctuation
ValueCountFrequency (%)
) 1124
99.4%
] 7
 
0.6%
Math Symbol
ValueCountFrequency (%)
+ 1
50.0%
~ 1
50.0%
Space Separator
ValueCountFrequency (%)
2292685
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17869843
83.1%
Common 3624148
 
16.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2269264
12.7%
i 1576409
 
8.8%
e 1449846
 
8.1%
t 1415514
 
7.9%
r 1302972
 
7.3%
o 1083406
 
6.1%
n 1034429
 
5.8%
s 708939
 
4.0%
c 702577
 
3.9%
A 655752
 
3.7%
Other values (45) 5670735
31.7%
Common
ValueCountFrequency (%)
2292685
63.3%
, 1293349
35.7%
- 16138
 
0.4%
' 6575
 
0.2%
. 5181
 
0.1%
? 4098
 
0.1%
/ 3790
 
0.1%
( 1125
 
< 0.1%
) 1124
 
< 0.1%
61
 
< 0.1%
Other values (9) 22
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21493923
> 99.9%
Punctuation 61
 
< 0.1%
None 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2292685
 
10.7%
a 2269264
 
10.6%
i 1576409
 
7.3%
e 1449846
 
6.7%
t 1415514
 
6.6%
r 1302972
 
6.1%
, 1293349
 
6.0%
o 1083406
 
5.0%
n 1034429
 
4.8%
s 708939
 
3.3%
Other values (60) 7067110
32.9%
Punctuation
ValueCountFrequency (%)
61
100.0%
None
ValueCountFrequency (%)
ô 4
57.1%
é 2
28.6%
ä 1
 
14.3%
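
The characters listed under None (ô, é, ä) are accented Latin letters from the Latin-1 Supplement range (U+0080 to U+00FF); the label presumably means the profiler did not resolve a block name for them. A quick way to locate such characters in a column is sketched below with the standard `unicodedata` module. The helper name `non_ascii_characters` is hypothetical, and the input strings are illustrative place names, not values taken from the dataset.

```python
import unicodedata
from collections import Counter


def non_ascii_characters(values):
    """Count characters outside ASCII after NFC normalisation."""
    counts = Counter()
    for value in values:
        # NFC composes base letter + combining mark into a single code point.
        for ch in unicodedata.normalize("NFC", value):
            if ord(ch) > 127:
                counts[ch] += 1
    return counts


# Hypothetical inputs for illustration only.
values = ["Côte d'Ivoire", "Réunion", "Florida"]
found = non_ascii_characters(values)
for ch, n in found.most_common():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} x{n}")
```

Listing the code point and official character name alongside each count makes it easier to decide whether a rare character is a legitimate diacritic or an encoding artefact.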

continent
Text

Missing 

Distinct 40
Distinct (%) < 0.1%
Missing 12727
Missing (%) 2.2%
Memory size 4.5 MiB

Length

Max length 35
Median length 13
Mean length 11.06538082
Min length 4

Characters and Unicode

Total characters 6327904
Distinct characters 27
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 6
Unique (%) < 0.1%

Sample

1st row South America
2nd row North America
3rd row North America
4th row North America
5th row Asia
ValueCountFrequency (%)
america 389824
38.5%
north 337992
33.4%
asia 94981
 
9.4%
south 74551
 
7.4%
africa 47173
 
4.7%
ocean 25976
 
2.6%
pacific 19043
 
1.9%
europe 8238
 
0.8%
australia 6861
 
0.7%
atlantic 4034
 
0.4%
Other values (5) 4117
 
0.4%

Most occurring characters

ValueCountFrequency (%)
r 790971
12.5%
a 598117
9.5%
i 584448
9.2%
A 543672
8.6%
c 506691
8.0%
440925
 
7.0%
t 429029
 
6.8%
e 424122
 
6.7%
o 420865
 
6.7%
h 412627
 
6.5%
Other values (17) 1176437
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4872457
77.0%
Uppercase Letter 1012246
 
16.0%
Space Separator 440925
 
7.0%
Other Punctuation 2276
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 790971
16.2%
a 598117
12.3%
i 584448
12.0%
c 506691
10.4%
t 429029
8.8%
e 424122
8.7%
o 420865
8.6%
h 412627
8.5%
m 389824
8.0%
s 101842
 
2.1%
Other values (6) 213921
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
A 543672
53.7%
N 337992
33.4%
S 74635
 
7.4%
O 25976
 
2.6%
P 19043
 
1.9%
E 8238
 
0.8%
I 2690
 
0.3%
Other Punctuation
ValueCountFrequency (%)
, 1732
76.1%
? 414
 
18.2%
/ 130
 
5.7%
Space Separator
ValueCountFrequency (%)
440925
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5884703
93.0%
Common 443201
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 790971
13.4%
a 598117
10.2%
i 584448
9.9%
A 543672
9.2%
c 506691
8.6%
t 429029
7.3%
e 424122
7.2%
o 420865
7.2%
h 412627
7.0%
m 389824
6.6%
Other values (13) 784337
13.3%
Common
ValueCountFrequency (%)
440925
99.5%
, 1732
 
0.4%
? 414
 
0.1%
/ 130
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6327904
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 790971
12.5%
a 598117
9.5%
i 584448
9.2%
A 543672
8.6%
c 506691
8.0%
440925
 
7.0%
t 429029
 
6.8%
e 424122
 
6.7%
o 420865
 
6.7%
h 412627
 
6.5%
Other values (17) 1176437
18.6%

waterBody
Text

Missing 

Distinct 67
Distinct (%) 0.3%
Missing 558515
Missing (%) 95.5%
Memory size 4.5 MiB

Length

Max length 55
Median length 19
Mean length 20.14311462
Min length 8

Characters and Unicode

Total characters 525272
Distinct characters 45
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 18
Unique (%) 0.1%

Sample

1st row Arctic Ocean
2nd row North Pacific Ocean
3rd row North Pacific Ocean
4th row North Pacific Ocean
5th row North Pacific Ocean
ValueCountFrequency (%)
ocean 26055
32.3%
pacific 19043
23.6%
north 16048
19.9%
south 6719
 
8.3%
atlantic 4113
 
5.1%
indian 2690
 
3.3%
sea 2523
 
3.1%
mediterranean 1992
 
2.5%
weddell 131
 
0.2%
arctic 125
 
0.2%
Other values (57) 1126
 
1.4%

Most occurring characters

ValueCountFrequency (%)
c 68650
13.1%
a 59282
11.3%
54488
10.4%
i 47442
9.0%
n 40099
 
7.6%
e 35322
 
6.7%
t 33362
 
6.4%
O 26120
 
5.0%
o 23090
 
4.4%
h 23023
 
4.4%
Other values (35) 114394
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 387593
73.8%
Uppercase Letter 80498
 
15.3%
Space Separator 54488
 
10.4%
Other Punctuation 2693
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 68650
17.7%
a 59282
15.3%
i 47442
12.2%
n 40099
10.3%
e 35322
9.1%
t 33362
8.6%
o 23090
 
6.0%
h 23023
 
5.9%
r 20677
 
5.3%
f 19227
 
5.0%
Other values (14) 17419
 
4.5%
Uppercase Letter
ValueCountFrequency (%)
O 26120
32.4%
P 19099
23.7%
N 16052
19.9%
S 9360
 
11.6%
A 4242
 
5.3%
I 2690
 
3.3%
M 1995
 
2.5%
B 240
 
0.3%
C 217
 
0.3%
W 158
 
0.2%
Other values (8) 325
 
0.4%
Other Punctuation
ValueCountFrequency (%)
, 2671
99.2%
? 22
 
0.8%
Space Separator
ValueCountFrequency (%)
54488
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 468091
89.1%
Common 57181
 
10.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 68650
14.7%
a 59282
12.7%
i 47442
10.1%
n 40099
8.6%
e 35322
 
7.5%
t 33362
 
7.1%
O 26120
 
5.6%
o 23090
 
4.9%
h 23023
 
4.9%
r 20677
 
4.4%
Other values (32) 91024
19.4%
Common
ValueCountFrequency (%)
54488
95.3%
, 2671
 
4.7%
? 22
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 525272
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 68650
13.1%
a 59282
11.3%
54488
10.4%
i 47442
9.0%
n 40099
 
7.6%
e 35322
 
6.7%
t 33362
 
6.4%
O 26120
 
5.0%
o 23090
 
4.4%
h 23023
 
4.4%
Other values (35) 114394
21.8%
Distinct 412
Distinct (%) 0.1%
Missing 5361
Missing (%) 0.9%
Memory size 4.5 MiB

Length

Max length 44
Median length 33
Mean length 10.04032243
Min length 4

Characters and Unicode

Total characters 5815666
Distinct characters 61
Distinct categories 7
Distinct scripts 2
Distinct blocks 2

Unique

Unique 54
Unique (%) < 0.1%

Sample

1st row Paraguay
2nd row United States
3rd row United States
4th row United States
5th row Philippines
ValueCountFrequency (%)
united 213165
24.6%
states 211303
24.4%
colombia 28517
 
3.3%
mexico 28030
 
3.2%
panama 27140
 
3.1%
canada 17446
 
2.0%
thailand 17424
 
2.0%
philippines 16445
 
1.9%
china 14052
 
1.6%
islands 13809
 
1.6%
Other values (324) 278914
32.2%

Most occurring characters

ValueCountFrequency (%)
a 731082
12.6%
t 721459
12.4%
e 576943
9.9%
i 510112
 
8.8%
n 497757
 
8.6%
s 311024
 
5.3%
d 305691
 
5.3%
287014
 
4.9%
U 228719
 
3.9%
S 225667
 
3.9%
Other values (51) 1420198
24.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4662654
80.2%
Uppercase Letter 860284
 
14.8%
Space Separator 287014
 
4.9%
Other Punctuation 4850
 
0.1%
Open Punctuation 428
 
< 0.1%
Close Punctuation 428
 
< 0.1%
Dash Punctuation 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 731082
15.7%
t 721459
15.5%
e 576943
12.4%
i 510112
10.9%
n 497757
10.7%
s 311024
6.7%
d 305691
6.6%
o 192414
 
4.1%
l 142950
 
3.1%
m 83495
 
1.8%
Other values (17) 589727
12.6%
Uppercase Letter
ValueCountFrequency (%)
U 228719
26.6%
S 225667
26.2%
C 79985
 
9.3%
P 55781
 
6.5%
M 41357
 
4.8%
I 36209
 
4.2%
T 26625
 
3.1%
A 23262
 
2.7%
E 18738
 
2.2%
R 17032
 
2.0%
Other values (14) 106909
12.4%
Other Punctuation
ValueCountFrequency (%)
. 2626
54.1%
? 1505
31.0%
, 421
 
8.7%
/ 295
 
6.1%
' 2
 
< 0.1%
\ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
287014
100.0%
Open Punctuation
ValueCountFrequency (%)
( 428
100.0%
Close Punctuation
ValueCountFrequency (%)
) 428
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5522938
95.0%
Common 292728
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 731082
13.2%
t 721459
13.1%
e 576943
10.4%
i 510112
9.2%
n 497757
9.0%
s 311024
 
5.6%
d 305691
 
5.5%
U 228719
 
4.1%
S 225667
 
4.1%
o 192414
 
3.5%
Other values (41) 1222070
22.1%
Common
ValueCountFrequency (%)
287014
98.0%
. 2626
 
0.9%
? 1505
 
0.5%
( 428
 
0.1%
) 428
 
0.1%
, 421
 
0.1%
/ 295
 
0.1%
- 8
 
< 0.1%
' 2
 
< 0.1%
\ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5815664
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 731082
12.6%
t 721459
12.4%
e 576943
9.9%
i 510112
 
8.8%
n 497757
 
8.6%
s 311024
 
5.3%
d 305691
 
5.3%
287014
 
4.9%
U 228719
 
3.9%
S 225667
 
3.9%
Other values (50) 1420196
24.4%
None
ValueCountFrequency (%)
ô 2
100.0%

stateProvince
Text

Missing 

Distinct 2242
Distinct (%) 0.5%
Missing 93871
Missing (%) 16.1%
Memory size 4.5 MiB

Length

Max length 71
Median length 40
Mean length 9.131608388
Min length 3

Characters and Unicode

Total characters 4481072
Distinct characters 67
Distinct categories 9
Distinct scripts 2
Distinct blocks 2

Unique

Unique 420
Unique (%) 0.1%

Sample

1st row Asuncion
2nd row Florida
3rd row South Dakota
4th row Maine
5th row Palawan
ValueCountFrequency (%)
california 23409
 
3.6%
new 20454
 
3.1%
alaska 19385
 
3.0%
virginia 14953
 
2.3%
arizona 13147
 
2.0%
maryland 10719
 
1.6%
florida 10644
 
1.6%
texas 9775
 
1.5%
columbia 9291
 
1.4%
island 9097
 
1.4%
Other values (2044) 512747
78.4%

Most occurring characters

ValueCountFrequency (%)
a 688102
15.4%
i 363250
 
8.1%
n 330347
 
7.4%
o 310192
 
6.9%
r 284630
 
6.4%
e 240206
 
5.4%
l 198665
 
4.4%
s 197499
 
4.4%
162900
 
3.6%
t 158835
 
3.5%
Other values (57) 1546446
34.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3642324
81.3%
Uppercase Letter 655432
 
14.6%
Space Separator 162900
 
3.6%
Dash Punctuation 12832
 
0.3%
Other Punctuation 7148
 
0.2%
Open Punctuation 216
 
< 0.1%
Close Punctuation 216
 
< 0.1%
Decimal Number 3
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 688102
18.9%
i 363250
10.0%
n 330347
9.1%
o 310192
8.5%
r 284630
 
7.8%
e 240206
 
6.6%
l 198665
 
5.5%
s 197499
 
5.4%
t 158835
 
4.4%
u 137454
 
3.8%
Other values (18) 733144
20.1%
Uppercase Letter
ValueCountFrequency (%)
C 87330
13.3%
M 61534
 
9.4%
A 60288
 
9.2%
N 58797
 
9.0%
S 40319
 
6.2%
T 35065
 
5.3%
I 30921
 
4.7%
P 30209
 
4.6%
D 27954
 
4.3%
B 25816
 
3.9%
Other values (16) 197199
30.1%
Other Punctuation
ValueCountFrequency (%)
' 3008
42.1%
? 1713
24.0%
/ 1358
19.0%
. 901
 
12.6%
, 168
 
2.4%
Decimal Number
ValueCountFrequency (%)
2 1
33.3%
8 1
33.3%
6 1
33.3%
Space Separator
ValueCountFrequency (%)
162900
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12832
100.0%
Open Punctuation
ValueCountFrequency (%)
( 216
100.0%
Close Punctuation
ValueCountFrequency (%)
) 216
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4297756
95.9%
Common 183316
 
4.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 688102
16.0%
i 363250
 
8.5%
n 330347
 
7.7%
o 310192
 
7.2%
r 284630
 
6.6%
e 240206
 
5.6%
l 198665
 
4.6%
s 197499
 
4.6%
t 158835
 
3.7%
u 137454
 
3.2%
Other values (44) 1388576
32.3%
Common
ValueCountFrequency (%)
162900
88.9%
- 12832
 
7.0%
' 3008
 
1.6%
? 1713
 
0.9%
/ 1358
 
0.7%
. 901
 
0.5%
( 216
 
0.1%
) 216
 
0.1%
, 168
 
0.1%
+ 1
 
< 0.1%
Other values (3) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4481070
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 688102
15.4%
i 363250
 
8.1%
n 330347
 
7.4%
o 310192
 
6.9%
r 284630
 
6.4%
e 240206
 
5.4%
l 198665
 
4.4%
s 197499
 
4.4%
162900
 
3.6%
t 158835
 
3.5%
Other values (55) 1546444
34.5%
None
ValueCountFrequency (%)
ô 1
50.0%
é 1
50.0%

county
Text

Missing 

Distinct 3216
Distinct (%) 1.4%
Missing 353572
Missing (%) 60.5%
Memory size 4.5 MiB

Length

Max length 39
Median length 31
Mean length 9.707878106
Min length 1

Characters and Unicode

Total characters 2242714
Distinct characters 69
Distinct categories 8
Distinct scripts 2
Distinct blocks 3

Unique

Unique 641
Unique (%) 0.3%

Sample

1st row Palawan Province
2nd row Bergen
3rd row North Solomons Province
4th row Clarke
5th row Augusta
ValueCountFrequency (%)
area 7116
 
2.1%
census 7108
 
2.1%
province 5993
 
1.8%
bergen 4929
 
1.5%
aleutians 4466
 
1.3%
county 4430
 
1.3%
west 4293
 
1.3%
borough 3777
 
1.1%
san 3628
 
1.1%
latah 3591
 
1.1%
Other values (2933) 289412
85.4%

Most occurring characters

ValueCountFrequency (%)
a 244424
 
10.9%
e 199313
 
8.9%
n 165552
 
7.4%
o 159648
 
7.1%
r 146489
 
6.5%
i 116092
 
5.2%
107723
 
4.8%
t 103825
 
4.6%
s 98320
 
4.4%
l 98134
 
4.4%
Other values (59) 803194
35.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1783043
79.5%
Uppercase Letter 339922
 
15.2%
Space Separator 107723
 
4.8%
Other Punctuation 7691
 
0.3%
Dash Punctuation 3359
 
0.1%
Open Punctuation 488
 
< 0.1%
Close Punctuation 487
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 244424
13.7%
e 199313
11.2%
n 165552
9.3%
o 159648
9.0%
r 146489
 
8.2%
i 116092
 
6.5%
t 103825
 
5.8%
s 98320
 
5.5%
l 98134
 
5.5%
u 80396
 
4.5%
Other values (19) 370850
20.8%
Uppercase Letter
ValueCountFrequency (%)
C 46264
13.6%
S 28692
 
8.4%
A 28447
 
8.4%
M 26585
 
7.8%
B 26532
 
7.8%
P 24905
 
7.3%
D 17542
 
5.2%
L 16783
 
4.9%
N 15007
 
4.4%
H 14626
 
4.3%
Other values (16) 94539
27.8%
Other Punctuation
ValueCountFrequency (%)
' 3565
46.4%
/ 2007
26.1%
. 1654
21.5%
? 462
 
6.0%
& 2
 
< 0.1%
, 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 3298
98.2%
61
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 481
98.6%
[ 7
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 480
98.6%
] 7
 
1.4%
Space Separator
ValueCountFrequency (%)
107723
100.0%
Math Symbol
ValueCountFrequency (%)
~ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2122965
94.7%
Common 119749
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 244424
 
11.5%
e 199313
 
9.4%
n 165552
 
7.8%
o 159648
 
7.5%
r 146489
 
6.9%
i 116092
 
5.5%
t 103825
 
4.9%
s 98320
 
4.6%
l 98134
 
4.6%
u 80396
 
3.8%
Other values (45) 710772
33.5%
Common
ValueCountFrequency (%)
107723
90.0%
' 3565
 
3.0%
- 3298
 
2.8%
/ 2007
 
1.7%
. 1654
 
1.4%
( 481
 
0.4%
) 480
 
0.4%
? 462
 
0.4%
61
 
0.1%
[ 7
 
< 0.1%
Other values (4) 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2242650
> 99.9%
Punctuation 61
 
< 0.1%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 244424
 
10.9%
e 199313
 
8.9%
n 165552
 
7.4%
o 159648
 
7.1%
r 146489
 
6.5%
i 116092
 
5.2%
107723
 
4.8%
t 103825
 
4.6%
s 98320
 
4.4%
l 98134
 
4.4%
Other values (55) 803130
35.8%
Punctuation
ValueCountFrequency (%)
61
100.0%
None
ValueCountFrequency (%)
ô 1
33.3%
é 1
33.3%
ä 1
33.3%

locality
Text

Missing 

Distinct 64257
Distinct (%) 13.5%
Missing 107551
Missing (%) 18.4%
Memory size 4.5 MiB

Length

Max length 929
Median length 128
Mean length 17.88850853
Min length 1

Characters and Unicode

Total characters 8533552
Distinct characters 112
Distinct categories 12
Distinct scripts 2
Distinct blocks 3

Unique

Unique 33923
Unique (%) 7.1%

Sample

1st row Asuncion
2nd row Bryant, Near
3rd row Owl'S Head
4th row Nali Barrio, Dam Site, Quezon Municipality
5th row Fort Lee
ValueCountFrequency (%)
island 33520
 
2.4%
mi 31811
 
2.3%
of 23110
 
1.6%
river 22675
 
1.6%
rio 21864
 
1.6%
km 18525
 
1.3%
fort 14257
 
1.0%
san 13196
 
0.9%
near 13030
 
0.9%
lake 11919
 
0.8%
Other values (33466) 1203009
85.5%

Most occurring characters

ValueCountFrequency (%)
929876
 
10.9%
a 913796
 
10.7%
e 542108
 
6.4%
o 539871
 
6.3%
n 524553
 
6.1%
i 502603
 
5.9%
r 415682
 
4.9%
l 354237
 
4.2%
t 336123
 
3.9%
s 280758
 
3.3%
Other values (102) 3193945
37.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5939792
69.6%
Uppercase Letter 1267564
 
14.9%
Space Separator 929876
 
10.9%
Other Punctuation 265001
 
3.1%
Decimal Number 105220
 
1.2%
Dash Punctuation 11355
 
0.1%
Open Punctuation 5161
 
0.1%
Close Punctuation 5158
 
0.1%
Math Symbol 4354
 
0.1%
Connector Punctuation 44
 
< 0.1%
Other values (2) 27
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 913796
15.4%
e 542108
9.1%
o 539871
9.1%
n 524553
8.8%
i 502603
 
8.5%
r 415682
 
7.0%
l 354237
 
6.0%
t 336123
 
5.7%
s 280758
 
4.7%
u 248244
 
4.2%
Other values (36) 1281817
21.6%
Uppercase Letter
ValueCountFrequency (%)
S 138100
 
10.9%
C 105786
 
8.3%
M 93660
 
7.4%
B 86701
 
6.8%
P 86250
 
6.8%
R 83615
 
6.6%
L 71730
 
5.7%
N 65469
 
5.2%
I 56770
 
4.5%
A 52214
 
4.1%
Other values (17) 427269
33.7%
Other Punctuation
ValueCountFrequency (%)
, 231908
87.5%
. 22167
 
8.4%
' 6055
 
2.3%
? 1337
 
0.5%
/ 958
 
0.4%
" 816
 
0.3%
: 659
 
0.2%
# 444
 
0.2%
& 364
 
0.1%
; 283
 
0.1%
Other values (3) 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 23097
22.0%
5 17642
16.8%
2 14487
13.8%
0 13388
12.7%
3 8220
 
7.8%
4 6775
 
6.4%
8 6256
 
5.9%
7 5786
 
5.5%
6 5236
 
5.0%
9 4333
 
4.1%
Math Symbol
ValueCountFrequency (%)
= 4291
98.6%
+ 56
 
1.3%
~ 7
 
0.2%
Open Punctuation
ValueCountFrequency (%)
( 3168
61.4%
[ 1992
38.6%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 3167
61.4%
] 1990
38.6%
} 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 11354
> 99.9%
1
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
12
80.0%
3
 
20.0%
Space Separator
ValueCountFrequency (%)
929876
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 44
100.0%
Initial Punctuation
ValueCountFrequency (%)
12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7207356
84.5%
Common 1326196
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 913796
 
12.7%
e 542108
 
7.5%
o 539871
 
7.5%
n 524553
 
7.3%
i 502603
 
7.0%
r 415682
 
5.8%
l 354237
 
4.9%
t 336123
 
4.7%
s 280758
 
3.9%
u 248244
 
3.4%
Other values (63) 2549381
35.4%
Common
ValueCountFrequency (%)
929876
70.1%
, 231908
 
17.5%
1 23097
 
1.7%
. 22167
 
1.7%
5 17642
 
1.3%
2 14487
 
1.1%
0 13388
 
1.0%
- 11354
 
0.9%
3 8220
 
0.6%
4 6775
 
0.5%
Other values (29) 47282
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8533180
> 99.9%
None 344
 
< 0.1%
Punctuation 28
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
929876
 
10.9%
a 913796
 
10.7%
e 542108
 
6.4%
o 539871
 
6.3%
n 524553
 
6.1%
i 502603
 
5.9%
r 415682
 
4.9%
l 354237
 
4.2%
t 336123
 
3.9%
s 280758
 
3.3%
Other values (77) 3193573
37.4%
None
ValueCountFrequency (%)
ñ 80
23.3%
ô 69
20.1%
á 58
16.9%
í 35
10.2%
ā 21
 
6.1%
é 17
 
4.9%
ã 13
 
3.8%
è 10
 
2.9%
ú 9
 
2.6%
ö 8
 
2.3%
Other values (11) 24
 
7.0%
Punctuation
ValueCountFrequency (%)
12
42.9%
12
42.9%
3
 
10.7%
1
 
3.6%

minimumElevationInMeters
Text

Missing

Distinct: 1234
Distinct (%): 1.4%
Missing: 498025
Missing (%): 85.2%
Memory size: 4.5 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.434495824
Min length: 3

Characters and Unicode

Total characters: 470448
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
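
As a rough illustration of the per-character statistics in these sections, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. It is not the code the report generator uses; the helper name `category_counts` is hypothetical, and the input is the five sample rows listed for this variable. The two-letter codes returned by `unicodedata.category` correspond to the labels in the report: `Nd` is Decimal Number, `Po` is Other Punctuation, and `Pd` is Dash Punctuation. The standard library does not expose script or block names, so those breakdowns are not reproduced here.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the given strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # unicodedata.category returns a two-letter code such as 'Nd' or 'Po'
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows shown for this variable.
sample = ["1040.0", "655.0", "1524.0", "30.0", "220.0"]
counts = category_counts(sample)
print(counts.most_common())
```

Running the same helper over a full column would give the "Most occurring categories" table for that column, up to the naming difference between the two-letter codes and the long labels.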

Unique

Unique: 380
Unique (%): 0.4%

Sample

1st row: 1040.0
2nd row: 655.0
3rd row: 1524.0
4th row: 30.0
5th row: 220.0

Value                 Count   Frequency (%)
1829.0                2622    3.0%
914.0                 2404    2.8%
1219.0                2324    2.7%
610.0                 2187    2.5%
1524.0                2057    2.4%
1676.0                2012    2.3%
305.0                 1843    2.1%
2134.0                1786    2.1%
1067.0                1655    1.9%
152.0                 1483    1.7%
Other values (1223)   66194   76.5%

Most occurring characters

ValueCountFrequency (%)
0 128131
27.2%
. 86567
18.4%
1 59889
12.7%
2 38751
 
8.2%
5 26492
 
5.6%
3 25784
 
5.5%
4 22371
 
4.8%
6 22225
 
4.7%
7 21405
 
4.5%
9 19560
 
4.2%
Other values (2) 19273
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 383879
81.6%
Other Punctuation 86567
 
18.4%
Dash Punctuation 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 128131
33.4%
1 59889
15.6%
2 38751
 
10.1%
5 26492
 
6.9%
3 25784
 
6.7%
4 22371
 
5.8%
6 22225
 
5.8%
7 21405
 
5.6%
9 19560
 
5.1%
8 19271
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 86567
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 470448
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 128131
27.2%
. 86567
18.4%
1 59889
12.7%
2 38751
 
8.2%
5 26492
 
5.6%
3 25784
 
5.5%
4 22371
 
4.8%
6 22225
 
4.7%
7 21405
 
4.5%
9 19560
 
4.2%
Other values (2) 19273
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 470448
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 128131
27.2%
. 86567
18.4%
1 59889
12.7%
2 38751
 
8.2%
5 26492
 
5.6%
3 25784
 
5.5%
4 22371
 
4.8%
6 22225
 
4.7%
7 21405
 
4.5%
9 19560
 
4.2%
Other values (2) 19273
 
4.1%
Distinct: 159
Distinct (%): 1.6%
Missing: 574727
Missing (%): 98.3%
Memory size: 4.5 MiB

Length

Max length: 6
Median length: 6
Mean length: 5.59138368
Min length: 4

Characters and Unicode

Total characters: 55159
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16
Unique (%): 0.2%

Sample

1st row: 76.0
2nd row: 1219.0
3rd row: 3200.0
4th row: 2743.0
5th row: 2438.0

Value                Count   Frequency (%)
305.0                1116    11.3%
1219.0               787     8.0%
1524.0               446     4.5%
1981.0               429     4.3%
762.0                402     4.1%
2743.0               345     3.5%
1372.0               302     3.1%
1676.0               257     2.6%
610.0                249     2.5%
1829.0               195     2.0%
Other values (149)   5337    54.1%

Most occurring characters

ValueCountFrequency (%)
0 14091
25.5%
. 9865
17.9%
1 6344
11.5%
2 5370
 
9.7%
3 4072
 
7.4%
5 3319
 
6.0%
6 2683
 
4.9%
7 2580
 
4.7%
9 2427
 
4.4%
4 2353
 
4.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45294
82.1%
Other Punctuation 9865
 
17.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 14091
31.1%
1 6344
14.0%
2 5370
 
11.9%
3 4072
 
9.0%
5 3319
 
7.3%
6 2683
 
5.9%
7 2580
 
5.7%
9 2427
 
5.4%
4 2353
 
5.2%
8 2055
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 9865
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 55159
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 14091
25.5%
. 9865
17.9%
1 6344
11.5%
2 5370
 
9.7%
3 4072
 
7.4%
5 3319
 
6.0%
6 2683
 
4.9%
7 2580
 
4.7%
9 2427
 
4.4%
4 2353
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55159
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 14091
25.5%
. 9865
17.9%
1 6344
11.5%
2 5370
 
9.7%
3 4072
 
7.4%
5 3319
 
6.0%
6 2683
 
4.9%
7 2580
 
4.7%
9 2427
 
4.4%
4 2353
 
4.3%

verbatimElevation
Text

Missing 

Distinct: 196
Distinct (%): 15.4%
Missing: 583323
Missing (%): 99.8%
Memory size: 4.5 MiB

Length

Max length: 84
Median length: 9
Mean length: 13.72813239
Min length: 3

Characters and Unicode

Total characters: 17421
Distinct characters: 55
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 108
Unique (%): 8.5%

Sample

1st row: altitude uncertain: label says both 5500 ft and 7000 ft
2nd row: ca. 1050 m
3rd row: ca. 4000 ft
4th row: sea level
5th row: 6230 ft

Value                Count   Frequency (%)
sea                  769     20.9%
level                769     20.9%
ft                   409     11.1%
ca                   177     4.8%
m                    115     3.1%
says                 114     3.1%
label                100     2.7%
altitude             92      2.5%
uncertain            74      2.0%
of                   67      1.8%
Other values (170)   986     26.9%

Most occurring characters

ValueCountFrequency (%)
e 2820
16.2%
2403
13.8%
l 1955
11.2%
a 1546
8.9%
0 1357
 
7.8%
s 1076
 
6.2%
t 881
 
5.1%
v 812
 
4.7%
f 520
 
3.0%
n 353
 
2.0%
Other values (45) 3698
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12005
68.9%
Decimal Number 2407
 
13.8%
Space Separator 2403
 
13.8%
Other Punctuation 415
 
2.4%
Math Symbol 88
 
0.5%
Dash Punctuation 72
 
0.4%
Uppercase Letter 27
 
0.2%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2820
23.5%
l 1955
16.3%
a 1546
12.9%
s 1076
 
9.0%
t 881
 
7.3%
v 812
 
6.8%
f 520
 
4.3%
n 353
 
2.9%
c 298
 
2.5%
i 282
 
2.3%
Other values (14) 1462
12.2%
Decimal Number
ValueCountFrequency (%)
0 1357
56.4%
1 245
 
10.2%
5 213
 
8.8%
6 133
 
5.5%
2 119
 
4.9%
3 102
 
4.2%
8 92
 
3.8%
9 57
 
2.4%
4 54
 
2.2%
7 35
 
1.5%
Uppercase Letter
ValueCountFrequency (%)
S 9
33.3%
L 8
29.6%
E 4
14.8%
A 2
 
7.4%
O 1
 
3.7%
C 1
 
3.7%
I 1
 
3.7%
B 1
 
3.7%
Other Punctuation
ValueCountFrequency (%)
. 236
56.9%
: 99
23.9%
, 55
 
13.3%
? 17
 
4.1%
" 4
 
1.0%
; 4
 
1.0%
Math Symbol
ValueCountFrequency (%)
< 34
38.6%
> 33
37.5%
+ 21
23.9%
Space Separator
ValueCountFrequency (%)
2403
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 72
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12032
69.1%
Common 5389
30.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2820
23.4%
l 1955
16.2%
a 1546
12.8%
s 1076
 
8.9%
t 881
 
7.3%
v 812
 
6.7%
f 520
 
4.3%
n 353
 
2.9%
c 298
 
2.5%
i 282
 
2.3%
Other values (22) 1489
12.4%
Common
ValueCountFrequency (%)
2403
44.6%
0 1357
25.2%
1 245
 
4.5%
. 236
 
4.4%
5 213
 
4.0%
6 133
 
2.5%
2 119
 
2.2%
3 102
 
1.9%
: 99
 
1.8%
8 92
 
1.7%
Other values (13) 390
 
7.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17421
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2820
16.2%
2403
13.8%
l 1955
11.2%
a 1546
8.9%
0 1357
 
7.8%
s 1076
 
6.2%
t 881
 
5.1%
v 812
 
4.7%
f 520
 
3.0%
n 353
 
2.0%
Other values (45) 3698
21.2%
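
The verbatimElevation samples above mix units and free text ("ca. 4000 ft", "ca. 1050 m", "sea level", and notes about conflicting labels). The following is a minimal sketch of how the simple forms could be normalised to metres. The function name `elevation_to_meters` and the accepted patterns are assumptions made for illustration, not the procedure used to produce this dataset; anything that does not match a simple pattern is returned as `None` rather than guessed. The only fixed fact relied on is that one foot is exactly 0.3048 m.

```python
import re

FT_TO_M = 0.3048  # exact definition of the international foot in metres

# Optional "ca." prefix, a number, then a unit of "ft" or "m".
_SIMPLE = re.compile(r"(?:ca\.?\s*)?(\d+(?:\.\d+)?)\s*(ft|m)")


def elevation_to_meters(text):
    """Return the elevation in metres for simple verbatim values, else None."""
    s = text.strip().lower()
    if s == "sea level":
        return 0.0
    match = _SIMPLE.fullmatch(s)
    if match is None:
        # Free-text notes (for example conflicting label values) are left unresolved.
        return None
    value = float(match.group(1))
    return value * FT_TO_M if match.group(2) == "ft" else value


for raw in ["ca. 1050 m", "ca. 4000 ft", "sea level", "6230 ft"]:
    print(raw, "->", elevation_to_meters(raw))
```

Under this sketch "ca. 4000 ft" comes out at about 1219 m, which is consistent with round-number foot values appearing among the most frequent metre values in the numeric elevation tables, though the report itself does not state how those values were derived.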

decimalLatitude
Text

Missing 

Distinct: 3288
Distinct (%): 11.7%
Missing: 556582
Missing (%): 95.2%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 5.237950732
Min length: 3

Characters and Unicode

Total characters: 146715
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1422
Unique (%): 5.1%

Sample

1st row: 38.4236
2nd row: 5.85
3rd row: 7.97
4th row: 10.52
5th row: 0.35

Value                 Count   Frequency (%)
34.9606               991     3.5%
31.5011               663     2.4%
9.03                  592     2.1%
8.25                  507     1.8%
6.45                  506     1.8%
29.3467               473     1.7%
3.65                  448     1.6%
6.17                  374     1.3%
12.63                 310     1.1%
68.13                 307     1.1%
Other values (3002)   22839   81.5%

Most occurring characters

ValueCountFrequency (%)
. 28010
19.1%
3 14831
10.1%
1 14391
9.8%
5 12116
8.3%
6 11692
8.0%
8 11017
 
7.5%
4 10578
 
7.2%
7 10361
 
7.1%
2 9852
 
6.7%
0 9608
 
6.5%
Other values (2) 14259
9.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 112580
76.7%
Other Punctuation 28010
 
19.1%
Dash Punctuation 6125
 
4.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 14831
13.2%
1 14391
12.8%
5 12116
10.8%
6 11692
10.4%
8 11017
9.8%
4 10578
9.4%
7 10361
9.2%
2 9852
8.8%
0 9608
8.5%
9 8134
7.2%
Other Punctuation
ValueCountFrequency (%)
. 28010
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6125
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 146715
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 28010
19.1%
3 14831
10.1%
1 14391
9.8%
5 12116
8.3%
6 11692
8.0%
8 11017
 
7.5%
4 10578
 
7.2%
7 10361
 
7.1%
2 9852
 
6.7%
0 9608
 
6.5%
Other values (2) 14259
9.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 146715
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 28010
19.1%
3 14831
10.1%
1 14391
9.8%
5 12116
8.3%
6 11692
8.0%
8 11017
 
7.5%
4 10578
 
7.2%
7 10361
 
7.1%
2 9852
 
6.7%
0 9608
 
6.5%
Other values (2) 14259
9.7%

decimalLongitude
Text

Missing 

Distinct: 3647
Distinct (%): 13.0%
Missing: 556582
Missing (%): 95.2%
Memory size: 4.5 MiB

Length

Max length: 8
Median length: 7
Mean length: 6.170867547
Min length: 3

Characters and Unicode

Total characters: 172846
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1662
Unique (%): 5.9%

Sample

1st row: -79.2803
2nd row: 116.08
3rd row: -73.95
4th row: -75.02
5th row: -176.53

Value                 Count   Frequency (%)
69.2778               991     3.5%
65.8453               663     2.4%
36.15                 546     1.9%
38.18                 502     1.8%
47.5206               473     1.7%
34.58                 464     1.7%
52.37                 452     1.6%
37.5                  368     1.3%
165.95                307     1.1%
74.08                 295     1.1%
Other values (3510)   22949   81.9%

Most occurring characters

ValueCountFrequency (%)
. 28010
16.2%
7 19303
11.2%
1 16353
9.5%
- 15594
9.0%
3 14186
8.2%
5 13851
8.0%
2 12908
7.5%
6 12740
7.4%
8 12372
7.2%
9 10011
 
5.8%
Other values (2) 17518
10.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 129242
74.8%
Other Punctuation 28010
 
16.2%
Dash Punctuation 15594
 
9.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 19303
14.9%
1 16353
12.7%
3 14186
11.0%
5 13851
10.7%
2 12908
10.0%
6 12740
9.9%
8 12372
9.6%
9 10011
7.7%
0 8858
6.9%
4 8660
6.7%
Other Punctuation
ValueCountFrequency (%)
. 28010
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15594
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 172846
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 28010
16.2%
7 19303
11.2%
1 16353
9.5%
- 15594
9.0%
3 14186
8.2%
5 13851
8.0%
2 12908
7.5%
6 12740
7.4%
8 12372
7.2%
9 10011
 
5.8%
Other values (2) 17518
10.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 172846
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 28010
16.2%
7 19303
11.2%
1 16353
9.5%
- 15594
9.0%
3 14186
8.2%
5 13851
8.0%
2 12908
7.5%
6 12740
7.4%
8 12372
7.2%
9 10011
 
5.8%
Other values (2) 17518
10.1%

geodeticDatum
Text

Missing 

Distinct: 2
Distinct (%): 0.6%
Missing: 584234
Missing (%): 99.9%
Memory size: 4.5 MiB

Length

Max length: 18
Median length: 18
Mean length: 17.95810056
Min length: 17

Characters and Unicode

Total characters: 6429
Distinct characters: 18
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WGS 84 (EPSG:4326)
2nd row: WGS 84 (EPSG:4326)
3rd row: WGS 84 (EPSG:4326)
4th row: WGS 84 (EPSG:4326)
5th row: WGS 84 (EPSG:4326)

Value       Count   Frequency (%)
wgs         343     32.4%
84          343     32.4%
epsg:4326   343     32.4%
nad83       15      1.4%
epsg:4269   15      1.4%

Most occurring characters

ValueCountFrequency (%)
S 701
10.9%
701
10.9%
4 701
10.9%
G 701
10.9%
: 358
 
5.6%
) 358
 
5.6%
8 358
 
5.6%
( 358
 
5.6%
E 358
 
5.6%
P 358
 
5.6%
Other values (8) 1477
23.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2506
39.0%
Decimal Number 2148
33.4%
Space Separator 701
 
10.9%
Other Punctuation 358
 
5.6%
Close Punctuation 358
 
5.6%
Open Punctuation 358
 
5.6%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 701
28.0%
G 701
28.0%
E 358
14.3%
P 358
14.3%
W 343
13.7%
N 15
 
0.6%
A 15
 
0.6%
D 15
 
0.6%
Decimal Number
ValueCountFrequency (%)
4 701
32.6%
8 358
16.7%
3 358
16.7%
2 358
16.7%
6 358
16.7%
9 15
 
0.7%
Space Separator
ValueCountFrequency (%)
701
100.0%
Other Punctuation
ValueCountFrequency (%)
: 358
100.0%
Close Punctuation
ValueCountFrequency (%)
) 358
100.0%
Open Punctuation
ValueCountFrequency (%)
( 358
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3923
61.0%
Latin 2506
39.0%

Most frequent character per script

Common
ValueCountFrequency (%)
701
17.9%
4 701
17.9%
: 358
9.1%
) 358
9.1%
8 358
9.1%
( 358
9.1%
3 358
9.1%
2 358
9.1%
6 358
9.1%
9 15
 
0.4%
Latin
ValueCountFrequency (%)
S 701
28.0%
G 701
28.0%
E 358
14.3%
P 358
14.3%
W 343
13.7%
N 15
 
0.6%
A 15
 
0.6%
D 15
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6429
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 701
10.9%
701
10.9%
4 701
10.9%
G 701
10.9%
: 358
 
5.6%
) 358
 
5.6%
8 358
 
5.6%
( 358
 
5.6%
E 358
 
5.6%
P 358
 
5.6%
Other values (8) 1477
23.0%

verbatimLatitude
Text

Missing 

Distinct: 3459
Distinct (%): 15.2%
Missing: 561806
Missing (%): 96.1%
Memory size: 4.5 MiB

Length

Max length: 13
Median length: 10
Mean length: 8.893180023
Min length: 2

Characters and Unicode

Total characters: 202640
Distinct characters: 20
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1703
Unique (%): 7.5%

Sample

1st row: 38 25 25 N
2nd row: 05 51 -- N
3rd row: 0021--N
4th row: 17 53 -- N
5th row: 03 19 -- N

Value                 Count   Frequency (%)
n                     12411   18.1%
(blank)               8372    12.2%
s                     4462    6.5%
05                    1518    2.2%
08                    1304    1.9%
27                    1303    1.9%
34                    1214    1.8%
38                    1142    1.7%
00                    1076    1.6%
57                    1070    1.6%
Other values (2075)   34712   50.6%

Most occurring characters

ValueCountFrequency (%)
45798
22.6%
- 26958
13.3%
0 19702
9.7%
N 16646
 
8.2%
3 13122
 
6.5%
1 11460
 
5.7%
5 11189
 
5.5%
4 10611
 
5.2%
2 10534
 
5.2%
8 7704
 
3.8%
Other values (10) 28916
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 103591
51.1%
Space Separator 45798
22.6%
Dash Punctuation 26958
 
13.3%
Uppercase Letter 22261
 
11.0%
Other Punctuation 4031
 
2.0%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 19702
19.0%
3 13122
12.7%
1 11460
11.1%
5 11189
10.8%
4 10611
10.2%
2 10534
10.2%
8 7704
 
7.4%
6 7319
 
7.1%
7 6250
 
6.0%
9 5700
 
5.5%
Other Punctuation
ValueCountFrequency (%)
. 3406
84.5%
' 612
 
15.2%
" 7
 
0.2%
/ 6
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
N 16646
74.8%
S 5610
 
25.2%
W 5
 
< 0.1%
Space Separator
ValueCountFrequency (%)
45798
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26958
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 180379
89.0%
Latin 22261
 
11.0%

Most frequent character per script

Common
ValueCountFrequency (%)
45798
25.4%
- 26958
14.9%
0 19702
10.9%
3 13122
 
7.3%
1 11460
 
6.4%
5 11189
 
6.2%
4 10611
 
5.9%
2 10534
 
5.8%
8 7704
 
4.3%
6 7319
 
4.1%
Other values (7) 15982
 
8.9%
Latin
ValueCountFrequency (%)
N 16646
74.8%
S 5610
 
25.2%
W 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 202639
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
45798
22.6%
- 26958
13.3%
0 19702
9.7%
N 16646
 
8.2%
3 13122
 
6.5%
1 11460
 
5.7%
5 11189
 
5.5%
4 10611
 
5.2%
2 10534
 
5.2%
8 7704
 
3.8%
Other values (9) 28915
14.3%
None
ValueCountFrequency (%)
° 1
100.0%

verbatimLongitude
Text

Missing 

Distinct: 3497
Distinct (%): 16.1%
Missing: 562895
Missing (%): 96.3%
Memory size: 4.5 MiB

Length

Max length: 14
Median length: 13
Mean length: 9.713554869
Min length: 4

Characters and Unicode

Total characters: 210755
Distinct characters: 20
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1738
Unique (%): 8.0%

Sample

1st row: 079 16 49 W
2nd row: 116 05 -- E
3rd row: 17632--W
4th row: 082 12 -- E
5th row: 39 28.82 E

Value                 Count   Frequency (%)
e                     9827    14.8%
(blank)               8668    13.1%
w                     6456    9.7%
37                    1471    2.2%
40                    1114    1.7%
16                    1052    1.6%
69                    964     1.5%
00                    932     1.4%
39                    783     1.2%
30                    738     1.1%
Other values (2292)   34294   51.7%

Most occurring characters

ValueCountFrequency (%)
44602
21.2%
- 26052
12.4%
0 21381
10.1%
1 17766
 
8.4%
3 13046
 
6.2%
E 10948
 
5.2%
2 10480
 
5.0%
4 10071
 
4.8%
W 10038
 
4.8%
6 9756
 
4.6%
Other values (10) 36615
17.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 115291
54.7%
Space Separator 44602
 
21.2%
Dash Punctuation 26052
 
12.4%
Uppercase Letter 20995
 
10.0%
Other Punctuation 3814
 
1.8%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 21381
18.5%
1 17766
15.4%
3 13046
11.3%
2 10480
9.1%
4 10071
8.7%
6 9756
8.5%
5 9680
8.4%
7 9555
8.3%
9 8505
 
7.4%
8 5051
 
4.4%
Uppercase Letter
ValueCountFrequency (%)
E 10948
52.1%
W 10038
47.8%
N 7
 
< 0.1%
S 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 3373
88.4%
' 434
 
11.4%
" 7
 
0.2%
Space Separator
ValueCountFrequency (%)
44602
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 26052
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 189760
90.0%
Latin 20995
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
44602
23.5%
- 26052
13.7%
0 21381
11.3%
1 17766
 
9.4%
3 13046
 
6.9%
2 10480
 
5.5%
4 10071
 
5.3%
6 9756
 
5.1%
5 9680
 
5.1%
7 9555
 
5.0%
Other values (6) 17371
 
9.2%
Latin
ValueCountFrequency (%)
E 10948
52.1%
W 10038
47.8%
N 7
 
< 0.1%
S 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 210754
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
44602
21.2%
- 26052
12.4%
0 21381
10.1%
1 17766
 
8.4%
3 13046
 
6.2%
E 10948
 
5.2%
2 10480
 
5.0%
4 10071
 
4.8%
W 10038
 
4.8%
6 9756
 
4.6%
Other values (9) 36614
17.4%
None
ValueCountFrequency (%)
° 1
100.0%
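
The verbatimLatitude and verbatimLongitude samples line up with the decimalLatitude and decimalLongitude samples: "38 25 25 N" corresponds to 38.4236, "05 51 -- N" to 5.85, and "079 16 49 W" to -79.2803. The sketch below shows one way such degrees/minutes/seconds strings could be converted to signed decimal degrees. The function name `dms_to_decimal` and its `deg_digits` parameter are hypothetical, and the handling of "--" as an unrecorded component and of compact forms such as "0021--N" is an assumption inferred from the sample rows, not a documented rule of the dataset.

```python
def dms_to_decimal(text, deg_digits=2):
    """Convert a verbatim coordinate such as '38 25 25 N' to signed decimal degrees.

    deg_digits is the width of the degrees field in compact, space-free values
    (assumed 2 for latitude, 3 for longitude).
    """
    text = text.strip()
    hemisphere = text[-1].upper()
    if hemisphere not in "NSEW":
        raise ValueError(f"no hemisphere letter in {text!r}")
    body = text[:-1].strip()
    if " " in body:
        parts = body.split()
    else:
        # Compact form: fixed-width degrees, two-digit minutes, then seconds.
        parts = [body[:deg_digits], body[deg_digits:deg_digits + 2], body[deg_digits + 2:]]
    # A component made only of dashes (or empty) is treated as not recorded.
    values = [0.0 if part.strip("-") == "" else float(part) for part in parts]
    values += [0.0] * (3 - len(values))
    degrees, minutes, seconds = values[:3]
    decimal = degrees + minutes / 60 + seconds / 3600
    return -decimal if hemisphere in "SW" else decimal


print(round(dms_to_decimal("38 25 25 N"), 4))
print(round(dms_to_decimal("079 16 49 W"), 4))
print(round(dms_to_decimal("17632--W", deg_digits=3), 2))
```

Southern and western hemispheres are returned as negative values, matching the sign convention visible in the decimalLongitude samples. Values recorded in other coordinate systems (the report also lists "UTM") are outside the scope of this sketch.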
Distinct: 4
Distinct (%): < 0.1%
Missing: 567281
Missing (%): 97.0%
Memory size: 4.5 MiB

Length

Max length: 23
Median length: 23
Mean length: 22.88076945
Min length: 3

Characters and Unicode

Total characters: 396089
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Degrees Minutes Seconds
2nd row: Degrees Minutes Seconds
3rd row: Degrees Minutes Seconds
4th row: Degrees Minutes Seconds
5th row: Degrees Minutes Seconds

Value     Count   Frequency (%)
degrees   17208   33.3%
minutes   17206   33.3%
seconds   17206   33.3%
utm       100     0.2%
unknown   3       < 0.1%
decimal   2       < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 86038
21.7%
s 51620
13.0%
n 34421
 
8.7%
34414
 
8.7%
M 17306
 
4.4%
o 17209
 
4.3%
D 17208
 
4.3%
c 17208
 
4.3%
g 17208
 
4.3%
r 17208
 
4.3%
Other values (12) 86249
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 309752
78.2%
Uppercase Letter 51923
 
13.1%
Space Separator 34414
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 86038
27.8%
s 51620
16.7%
n 34421
11.1%
o 17209
 
5.6%
c 17208
 
5.6%
g 17208
 
5.6%
r 17208
 
5.6%
i 17208
 
5.6%
d 17208
 
5.6%
t 17206
 
5.6%
Other values (6) 17218
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
M 17306
33.3%
D 17208
33.1%
S 17206
33.1%
U 103
 
0.2%
T 100
 
0.2%
Space Separator
ValueCountFrequency (%)
34414
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 361675
91.3%
Common 34414
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 86038
23.8%
s 51620
14.3%
n 34421
9.5%
M 17306
 
4.8%
o 17209
 
4.8%
D 17208
 
4.8%
c 17208
 
4.8%
g 17208
 
4.8%
r 17208
 
4.8%
i 17208
 
4.8%
Other values (11) 69041
19.1%
Common
ValueCountFrequency (%)
34414
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 396089
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 86038
21.7%
s 51620
13.0%
n 34421
 
8.7%
34414
 
8.7%
M 17306
 
4.4%
o 17209
 
4.3%
D 17208
 
4.3%
c 17208
 
4.3%
g 17208
 
4.3%
r 17208
 
4.3%
Other values (12) 86249
21.8%

georeferenceProtocol
Text

Missing 

Distinct: 11
Distinct (%): 0.9%
Missing: 583342
Missing (%): 99.8%
Memory size: 4.5 MiB

Length

Max length: 21
Median length: 3
Mean length: 7.1184
Min length: 3

Characters and Unicode

Total characters: 8898
Distinct characters: 33
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.2%

Sample

1st row: GEOLocate tool
2nd row: GPS
3rd row: Google Earth maps
4th row: GPS
5th row: GPS

Value              Count   Frequency (%)
gps                739     39.4%
earth              195     10.4%
maps               195     10.4%
google             195     10.4%
geolocate          179     9.6%
tool               179     9.6%
map                109     5.8%
online             18      1.0%
recorded           15      0.8%
not                15      0.8%
Other values (7)   35      1.9%

Most occurring characters

ValueCountFrequency (%)
G 1114
12.5%
o 988
11.1%
P 739
 
8.3%
S 739
 
8.3%
a 700
 
7.9%
624
 
7.0%
t 582
 
6.5%
e 436
 
4.9%
l 413
 
4.6%
E 374
 
4.2%
Other values (23) 2189
24.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4838
54.4%
Uppercase Letter 3436
38.6%
Space Separator 624
 
7.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 988
20.4%
a 700
14.5%
t 582
12.0%
e 436
9.0%
l 413
8.5%
p 306
 
6.3%
m 244
 
5.0%
r 237
 
4.9%
c 205
 
4.2%
g 195
 
4.0%
Other values (12) 532
11.0%
Uppercase Letter
ValueCountFrequency (%)
G 1114
32.4%
P 739
21.5%
S 739
21.5%
E 374
 
10.9%
O 179
 
5.2%
L 179
 
5.2%
M 81
 
2.4%
U 11
 
0.3%
C 10
 
0.3%
T 10
 
0.3%
Space Separator
ValueCountFrequency (%)
624
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8274
93.0%
Common 624
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 1114
13.5%
o 988
11.9%
P 739
 
8.9%
S 739
 
8.9%
a 700
 
8.5%
t 582
 
7.0%
e 436
 
5.3%
l 413
 
5.0%
E 374
 
4.5%
p 306
 
3.7%
Other values (22) 1883
22.8%
Common
ValueCountFrequency (%)
624
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8898
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 1114
12.5%
o 988
11.1%
P 739
 
8.3%
S 739
 
8.3%
a 700
 
7.9%
624
 
7.0%
t 582
 
6.5%
e 436
 
4.9%
l 413
 
4.6%
E 374
 
4.2%
Other values (23) 2189
24.6%
Distinct: 5
Distinct (%): 0.7%
Missing: 583894
Missing (%): 99.9%
Memory size: 4.5 MiB

Length

Max length: 12
Median length: 9
Mean length: 8.736389685
Min length: 3

Characters and Unicode

Total characters: 6098
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: uncertain
2nd row: uncertain
3rd row: uncertain
4th row: uncertain
5th row: uncertain

Value       Count   Frequency (%)
uncertain   663     94.3%
cf          29      4.1%
sp          4       0.6%
aff         4       0.6%
near        2       0.3%
vel         1       0.1%

Most occurring characters

ValueCountFrequency (%)
n 1328
21.8%
c 692
11.3%
a 669
11.0%
e 666
10.9%
r 665
10.9%
u 663
10.9%
t 663
10.9%
i 663
10.9%
f 37
 
0.6%
. 37
 
0.6%
Other values (5) 15
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6056
99.3%
Other Punctuation 37
 
0.6%
Space Separator 5
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1328
21.9%
c 692
11.4%
a 669
11.0%
e 666
11.0%
r 665
11.0%
u 663
10.9%
t 663
10.9%
i 663
10.9%
f 37
 
0.6%
s 4
 
0.1%
Other values (3) 6
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 37
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6056
99.3%
Common 42
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1328
21.9%
c 692
11.4%
a 669
11.0%
e 666
11.0%
r 665
11.0%
u 663
10.9%
t 663
10.9%
i 663
10.9%
f 37
 
0.6%
s 4
 
0.1%
Other values (3) 6
 
0.1%
Common
ValueCountFrequency (%)
. 37
88.1%
5
 
11.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6098
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1328
21.8%
c 692
11.3%
a 669
11.0%
e 666
10.9%
r 665
10.9%
u 663
10.9%
t 663
10.9%
i 663
10.9%
f 37
 
0.6%
. 37
 
0.6%
Other values (5) 15
 
0.2%

typeStatus
Text

Missing 

Distinct: 10
Distinct (%): 0.3%
Missing: 580614
Missing (%): 99.3%
Memory size: 4.5 MiB

Length

Max length: 37
Median length: 4
Mean length: 4.667169432
Min length: 4

Characters and Unicode

Total characters: 18566
Distinct characters: 29
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): 0.1%

Sample

1st row: Cotype
2nd row: Type
3rd row: Type
4th row: Type
5th row: Type

Value       Count   Frequency (%)
type        2763    69.0%
cotype      1217    30.4%
possible    12      0.3%
probable    5       0.1%
fide        2       < 0.1%
m           2       < 0.1%
r           2       < 0.1%
browning    2       < 0.1%
lectotype   1       < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 4001
21.6%
y 3981
21.4%
p 3981
21.4%
T 2763
14.9%
o 1237
 
6.7%
t 1219
 
6.6%
C 1217
 
6.6%
28
 
0.2%
s 24
 
0.1%
b 22
 
0.1%
Other values (19) 93
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 14523
78.2%
Uppercase Letter 4004
 
21.6%
Space Separator 28
 
0.2%
Other Punctuation 7
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4001
27.5%
y 3981
27.4%
p 3981
27.4%
o 1237
 
8.5%
t 1219
 
8.4%
s 24
 
0.2%
b 22
 
0.2%
l 17
 
0.1%
i 16
 
0.1%
r 7
 
< 0.1%
Other values (7) 18
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
T 2763
69.0%
C 1217
30.4%
P 17
 
0.4%
M 2
 
< 0.1%
R 2
 
< 0.1%
B 2
 
< 0.1%
L 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 4
57.1%
; 3
42.9%
Space Separator
ValueCountFrequency (%)
28
100.0%
Open Punctuation
ValueCountFrequency (%)
( 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18527
99.8%
Common 39
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4001
21.6%
y 3981
21.5%
p 3981
21.5%
T 2763
14.9%
o 1237
 
6.7%
t 1219
 
6.6%
C 1217
 
6.6%
s 24
 
0.1%
b 22
 
0.1%
P 17
 
0.1%
Other values (14) 65
 
0.4%
Common
ValueCountFrequency (%)
28
71.8%
. 4
 
10.3%
; 3
 
7.7%
( 2
 
5.1%
) 2
 
5.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18566
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4001
21.6%
y 3981
21.4%
p 3981
21.4%
T 2763
14.9%
o 1237
 
6.7%
t 1219
 
6.6%
C 1217
 
6.6%
28
 
0.2%
s 24
 
0.1%
b 22
 
0.1%
Other values (19) 93
 
0.5%

identifiedBy
Text

Missing 

Distinct: 69
Distinct (%): 2.0%
Missing: 581206
Missing (%): 99.4%
Memory size: 4.5 MiB

Length

Max length: 129
Median length: 18
Mean length: 24.97489663
Min length: 9

Characters and Unicode

Total characters: 84565
Distinct characters: 60
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22
Unique (%): 0.6%

Sample

1st row: Wetmore, Alexander
2nd row: Maley, James M, Collections Manager, Occidental College - Moore Laboratory of Zoology (UNITED STATES)
3rd row: Wetmore, Alexander
4th row: Verhelst, Juan C
5th row: Clark, W. S.

Value                Count   Frequency (%)
wetmore              2393    21.9%
alexander            2382    21.8%
of                   294     2.7%
(blank)              268     2.5%
united               266     2.4%
states               265     2.4%
museum               246     2.3%
history              200     1.8%
natural              200     1.8%
birds                198     1.8%
Other values (178)   4219    38.6%

Most occurring characters

ValueCountFrequency (%)
e 11582
13.7%
7545
 
8.9%
r 6594
 
7.8%
a 5098
 
6.0%
o 5033
 
6.0%
t 4517
 
5.3%
n 4224
 
5.0%
l 4082
 
4.8%
, 3962
 
4.7%
m 3194
 
3.8%
Other values (50) 28734
34.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 57333
67.8%
Uppercase Letter 13878
 
16.4%
Space Separator 7545
 
8.9%
Other Punctuation 4577
 
5.4%
Close Punctuation 477
 
0.6%
Open Punctuation 477
 
0.6%
Dash Punctuation 270
 
0.3%
Decimal Number 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 11582
20.2%
r 6594
11.5%
a 5098
8.9%
o 5033
8.8%
t 4517
 
7.9%
n 4224
 
7.4%
l 4082
 
7.1%
m 3194
 
5.6%
d 2638
 
4.6%
x 2382
 
4.2%
Other values (16) 7989
13.9%
Uppercase Letter
ValueCountFrequency (%)
A 2751
19.8%
W 2686
19.4%
S 1048
 
7.6%
I 824
 
5.9%
T 808
 
5.8%
M 697
 
5.0%
N 690
 
5.0%
C 642
 
4.6%
D 607
 
4.4%
E 558
 
4.0%
Other values (14) 2567
18.5%
Decimal Number
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
5 2
25.0%
0 2
25.0%
Other Punctuation
ValueCountFrequency (%)
, 3962
86.6%
. 615
 
13.4%
Space Separator
ValueCountFrequency (%)
7545
100.0%
Close Punctuation
ValueCountFrequency (%)
) 477
100.0%
Open Punctuation
ValueCountFrequency (%)
( 477
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 270
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 71211
84.2%
Common 13354
 
15.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 11582
16.3%
r 6594
 
9.3%
a 5098
 
7.2%
o 5033
 
7.1%
t 4517
 
6.3%
n 4224
 
5.9%
l 4082
 
5.7%
m 3194
 
4.5%
A 2751
 
3.9%
W 2686
 
3.8%
Other values (40) 21450
30.1%
Common
ValueCountFrequency (%)
7545
56.5%
, 3962
29.7%
. 615
 
4.6%
) 477
 
3.6%
( 477
 
3.6%
- 270
 
2.0%
1 2
 
< 0.1%
9 2
 
< 0.1%
5 2
 
< 0.1%
0 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 84564
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 11582
13.7%
7545
 
8.9%
r 6594
 
7.8%
a 5098
 
6.0%
o 5033
 
6.0%
t 4517
 
5.3%
n 4224
 
5.0%
l 4082
 
4.8%
, 3962
 
4.7%
m 3194
 
3.8%
Other values (49) 28733
34.0%
None
ValueCountFrequency (%)
à 1
100.0%
Distinct 22061
Distinct (%) 3.8%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 65
Median length 50
Mean length 23.69967259
Min length 7

Characters and Unicode

Total characters 13854639
Distinct characters 58
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
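
As a rough illustration of the character properties mentioned above, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The function name `category_counts` and the label mapping are illustrative assumptions, not the profiler's own code; the input strings are the five sample rows listed for this variable.

```python
import unicodedata
from collections import Counter

# Readable labels for a few two-letter general category codes (illustrative subset).
CATEGORY_LABELS = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

# Sample rows copied from the report; the real column has many more values.
sample_rows = [
    "Paroaria capitata",
    "Rostrhamus sociabilis",
    "Bartramia longicauda",
    "Sterna hirundo",
    "Prionochilus plateni",
]

counts = category_counts(sample_rows)
total = sum(counts.values())
for code, n in counts.most_common():
    label = CATEGORY_LABELS.get(code, code)
    print(f"{label:<20}{n:>5}{n / total:>8.1%}")
```

Run over the full column instead of five rows, a tally like this should resemble the "Most occurring categories" table, although the profiler's exact implementation may differ.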

Unique

Unique 3436
Unique (%) 0.6%

Sample

1st row Paroaria capitata
2nd row Rostrhamus sociabilis
3rd row Bartramia longicauda
4th row Sterna hirundo
5th row Prionochilus plateni

Value                   Count      Frequency (%)
dendroica               14826      1.0%
parus                   7485       0.5%
melospiza               7103       0.5%
turdus                  6813       0.5%
vireo                   6404       0.4%
calidris                6376       0.4%
sterna                  6184       0.4%
hyemalis                5963       0.4%
melodia                 5927       0.4%
carduelis               5742       0.4%
Other values (10903)    1419872    95.1%

Most occurring characters

ValueCountFrequency (%)
a 1477651
 
10.7%
i 1303096
 
9.4%
s 1190344
 
8.6%
r 934012
 
6.7%
908103
 
6.6%
e 885911
 
6.4%
u 853994
 
6.2%
o 821323
 
5.9%
l 776498
 
5.6%
n 730705
 
5.3%
Other values (48) 3973002
28.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12360285
89.2%
Space Separator 908103
 
6.6%
Uppercase Letter 584699
 
4.2%
Other Punctuation 1511
 
< 0.1%
Dash Punctuation 41
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1477651
12.0%
i 1303096
10.5%
s 1190344
9.6%
r 934012
 
7.6%
e 885911
 
7.2%
u 853994
 
6.9%
o 821323
 
6.6%
l 776498
 
6.3%
n 730705
 
5.9%
c 671384
 
5.4%
Other values (16) 2715367
22.0%
Uppercase Letter
ValueCountFrequency (%)
C 92584
15.8%
P 87973
15.0%
A 57125
9.8%
S 48743
8.3%
M 44873
 
7.7%
T 42452
 
7.3%
D 28042
 
4.8%
L 25741
 
4.4%
E 22719
 
3.9%
G 16764
 
2.9%
Other values (16) 117683
20.1%
Other Punctuation
ValueCountFrequency (%)
. 1121
74.2%
" 348
 
23.0%
/ 37
 
2.4%
? 5
 
0.3%
Space Separator
ValueCountFrequency (%)
908103
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 41
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12944984
93.4%
Common 909655
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1477651
11.4%
i 1303096
 
10.1%
s 1190344
 
9.2%
r 934012
 
7.2%
e 885911
 
6.8%
u 853994
 
6.6%
o 821323
 
6.3%
l 776498
 
6.0%
n 730705
 
5.6%
c 671384
 
5.2%
Other values (42) 3300066
25.5%
Common
ValueCountFrequency (%)
908103
99.8%
. 1121
 
0.1%
" 348
 
< 0.1%
- 41
 
< 0.1%
/ 37
 
< 0.1%
? 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13854639
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1477651
 
10.7%
i 1303096
 
9.4%
s 1190344
 
8.6%
r 934012
 
6.7%
908103
 
6.6%
e 885911
 
6.4%
u 853994
 
6.2%
o 821323
 
5.9%
l 776498
 
5.6%
n 730705
 
5.3%
Other values (48) 3973002
28.7%
Distinct 185
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 89
Median length 78
Mean length 65.97973972
Min length 45

Characters and Unicode

Total characters 38571228
Distinct characters 47
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) < 0.1%

Sample

1st row Animalia, Chordata, Vertebrata, Aves, Passeriformes, Emberizidae, Emberizinae
2nd row Animalia, Chordata, Vertebrata, Aves, Falconiformes, Accipitridae
3rd row Animalia, Chordata, Vertebrata, Aves, Charadriiformes, Scolopacidae
4th row Animalia, Chordata, Vertebrata, Aves, Charadriiformes, Laridae
5th row Animalia, Chordata, Vertebrata, Aves, Passeriformes, Dicaeidae
ValueCountFrequency (%)
animalia 584592
16.0%
aves 584592
16.0%
chordata 584592
16.0%
vertebrata 584592
16.0%
passeriformes 372479
10.2%
emberizidae 72754
 
2.0%
emberizinae 50573
 
1.4%
charadriiformes 44080
 
1.2%
parulidae 36362
 
1.0%
tyrannidae 27497
 
0.8%
Other values (206) 702489
19.3%
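
The word counts above look like the result of lowercasing each value, splitting it into tokens and counting them. The sketch below reproduces that idea on the five sample rows. The tokenization rule (split on whitespace, strip surrounding punctuation) and the function name `word_frequencies` are assumptions made for illustration; the report does not state the profiler's actual tokenizer.

```python
import string
from collections import Counter

def word_frequencies(values):
    """Lowercase each value, split on whitespace, strip punctuation, count tokens."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            word = token.strip(string.punctuation)
            if word:  # skip tokens that were punctuation only
                counts[word] += 1
    return counts

# Sample rows copied from the report; the real column has 584592 values.
sample_rows = [
    "Animalia, Chordata, Vertebrata, Aves, Passeriformes, Emberizidae, Emberizinae",
    "Animalia, Chordata, Vertebrata, Aves, Falconiformes, Accipitridae",
    "Animalia, Chordata, Vertebrata, Aves, Charadriiformes, Scolopacidae",
    "Animalia, Chordata, Vertebrata, Aves, Charadriiformes, Laridae",
    "Animalia, Chordata, Vertebrata, Aves, Passeriformes, Dicaeidae",
]

counts = word_frequencies(sample_rows)
total = sum(counts.values())
for word, n in counts.most_common(5):
    print(f"{word:<16}{n:>3}{n / total:>8.1%}")
```

Under this assumed rule, the four leading ranks appear once per row, which is consistent with animalia, chordata, vertebrata and aves each showing a count equal to the number of observations in the table.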

Most occurring characters

ValueCountFrequency (%)
a 5107701
13.2%
e 3704731
 
9.6%
r 3367631
 
8.7%
3060010
 
7.9%
, 3060009
 
7.9%
i 3035379
 
7.9%
s 1981990
 
5.1%
t 1944511
 
5.0%
o 1467327
 
3.8%
m 1357393
 
3.5%
Other values (37) 10484546
27.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28806607
74.7%
Uppercase Letter 3644601
 
9.4%
Space Separator 3060010
 
7.9%
Other Punctuation 3060010
 
7.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 5107701
17.7%
e 3704731
12.9%
r 3367631
11.7%
i 3035379
10.5%
s 1981990
 
6.9%
t 1944511
 
6.8%
o 1467327
 
5.1%
m 1357393
 
4.7%
d 1347875
 
4.7%
n 998616
 
3.5%
Other values (13) 4493453
15.6%
Uppercase Letter
ValueCountFrequency (%)
A 1269397
34.8%
C 741968
20.4%
V 592119
16.2%
P 536883
14.7%
E 129534
 
3.6%
T 114606
 
3.1%
S 71084
 
2.0%
F 49758
 
1.4%
M 26183
 
0.7%
G 25043
 
0.7%
Other values (11) 88026
 
2.4%
Other Punctuation
ValueCountFrequency (%)
, 3060009
> 99.9%
? 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3060010
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32451208
84.1%
Common 6120020
 
15.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 5107701
15.7%
e 3704731
11.4%
r 3367631
10.4%
i 3035379
 
9.4%
s 1981990
 
6.1%
t 1944511
 
6.0%
o 1467327
 
4.5%
m 1357393
 
4.2%
d 1347875
 
4.2%
A 1269397
 
3.9%
Other values (34) 7867273
24.2%
Common
ValueCountFrequency (%)
3060010
50.0%
, 3060009
50.0%
? 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 38571228
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 5107701
13.2%
e 3704731
 
9.6%
r 3367631
 
8.7%
3060010
 
7.9%
, 3060009
 
7.9%
i 3035379
 
7.9%
s 1981990
 
5.1%
t 1944511
 
5.0%
o 1467327
 
3.8%
m 1357393
 
3.5%
Other values (37) 10484546
27.2%

kingdom
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 4676736
Distinct characters 6
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia

Value       Count     Frequency (%)
animalia    584592    100.0%

Most occurring characters

ValueCountFrequency (%)
i 1169184
25.0%
a 1169184
25.0%
A 584592
12.5%
n 584592
12.5%
m 584592
12.5%
l 584592
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4092144
87.5%
Uppercase Letter 584592
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1169184
28.6%
a 1169184
28.6%
n 584592
14.3%
m 584592
14.3%
l 584592
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 584592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4676736
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1169184
25.0%
a 1169184
25.0%
A 584592
12.5%
n 584592
12.5%
m 584592
12.5%
l 584592
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4676736
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1169184
25.0%
a 1169184
25.0%
A 584592
12.5%
n 584592
12.5%
m 584592
12.5%
l 584592
12.5%

phylum
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 4676736
Distinct characters 7
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Chordata
2nd row Chordata
3rd row Chordata
4th row Chordata
5th row Chordata

Value       Count     Frequency (%)
chordata    584592    100.0%

Most occurring characters

ValueCountFrequency (%)
a 1169184
25.0%
C 584592
12.5%
h 584592
12.5%
o 584592
12.5%
r 584592
12.5%
d 584592
12.5%
t 584592
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4092144
87.5%
Uppercase Letter 584592
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1169184
28.6%
h 584592
14.3%
o 584592
14.3%
r 584592
14.3%
d 584592
14.3%
t 584592
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 584592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4676736
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1169184
25.0%
C 584592
12.5%
h 584592
12.5%
o 584592
12.5%
r 584592
12.5%
d 584592
12.5%
t 584592
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4676736
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1169184
25.0%
C 584592
12.5%
h 584592
12.5%
o 584592
12.5%
r 584592
12.5%
d 584592
12.5%
t 584592
12.5%

class
Text

Constant 

Distinct 1
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 2338368
Distinct characters 4
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Aves
2nd row Aves
3rd row Aves
4th row Aves
5th row Aves

Value    Count     Frequency (%)
aves     584592    100.0%

Most occurring characters

ValueCountFrequency (%)
A 584592
25.0%
v 584592
25.0%
e 584592
25.0%
s 584592
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1753776
75.0%
Uppercase Letter 584592
 
25.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
v 584592
33.3%
e 584592
33.3%
s 584592
33.3%
Uppercase Letter
ValueCountFrequency (%)
A 584592
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2338368
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 584592
25.0%
v 584592
25.0%
e 584592
25.0%
s 584592
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2338368
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 584592
25.0%
v 584592
25.0%
e 584592
25.0%
s 584592
25.0%

order
Text

Distinct 23
Distinct (%) < 0.1%
Missing 12
Missing (%) < 0.1%
Memory size 4.5 MiB

Length

Max length 17
Median length 13
Mean length 12.92974443
Min length 10

Characters and Unicode

Total characters 7558470
Distinct characters 26
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Passeriformes
2nd row Falconiformes
3rd row Charadriiformes
4th row Charadriiformes
5th row Passeriformes

Value                Count     Frequency (%)
passeriformes        372479    63.7%
charadriiformes      44080     7.5%
piciformes           22599     3.9%
apodiformes          18145     3.1%
falconiformes        15873     2.7%
anseriformes         15668     2.7%
galliformes          14867     2.5%
columbiformes        13075     2.2%
coraciiformes        9455      1.6%
psittaciformes       7419      1.3%
Other values (13)    50920     8.7%

Most occurring characters

ValueCountFrequency (%)
s 1351819
17.9%
r 1103423
14.6%
e 993432
13.1%
i 704760
9.3%
o 661430
8.8%
m 600232
7.9%
f 583174
7.7%
a 529233
 
7.0%
P 417687
 
5.5%
l 89937
 
1.2%
Other values (16) 523343
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6973890
92.3%
Uppercase Letter 584580
 
7.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1351819
19.4%
r 1103423
15.8%
e 993432
14.2%
i 704760
10.1%
o 661430
9.5%
m 600232
8.6%
f 583174
8.4%
a 529233
 
7.6%
l 89937
 
1.3%
c 83254
 
1.2%
Other values (9) 273196
 
3.9%
Uppercase Letter
ValueCountFrequency (%)
P 417687
71.5%
C 84531
 
14.5%
A 33813
 
5.8%
G 22759
 
3.9%
F 15873
 
2.7%
S 7954
 
1.4%
T 1963
 
0.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 7558470
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1351819
17.9%
r 1103423
14.6%
e 993432
13.1%
i 704760
9.3%
o 661430
8.8%
m 600232
7.9%
f 583174
7.7%
a 529233
 
7.0%
P 417687
 
5.5%
l 89937
 
1.2%
Other values (16) 523343
 
6.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7558470
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1351819
17.9%
r 1103423
14.6%
e 993432
13.1%
i 704760
9.3%
o 661430
8.8%
m 600232
7.9%
f 583174
7.7%
a 529233
 
7.0%
P 417687
 
5.5%
l 89937
 
1.2%
Other values (16) 523343
 
6.9%

family
Text

Distinct 170
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 4.5 MiB

Length

Max length 17
Median length 15
Mean length 10.09870987
Min length 7

Characters and Unicode

Total characters 5903625
Distinct characters 45
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Emberizidae
2nd row Accipitridae
3rd row Scolopacidae
4th row Laridae
5th row Dicaeidae

Value                 Count     Frequency (%)
emberizidae           72754     12.4%
parulidae             36362     6.2%
tyrannidae            27497     4.7%
turdidae              24865     4.3%
icteridae             19964     3.4%
picidae               17391     3.0%
scolopacidae          16648     2.8%
anatidae              15610     2.7%
fringillidae          15545     2.7%
sylviidae             15340     2.6%
Other values (160)    322617    55.2%

Most occurring characters

ValueCountFrequency (%)
i 940552
15.9%
a 860854
14.6%
e 744168
12.6%
d 669425
11.3%
r 404257
 
6.8%
l 239327
 
4.1%
c 213294
 
3.6%
o 197967
 
3.4%
n 189947
 
3.2%
t 142863
 
2.4%
Other values (35) 1300971
22.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5319031
90.1%
Uppercase Letter 584592
 
9.9%
Space Separator 1
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 940552
17.7%
a 860854
16.2%
e 744168
14.0%
d 669425
12.6%
r 404257
7.6%
l 239327
 
4.5%
c 213294
 
4.0%
o 197967
 
3.7%
n 189947
 
3.6%
t 142863
 
2.7%
Other values (12) 716377
13.5%
Uppercase Letter
ValueCountFrequency (%)
P 113713
19.5%
T 91638
15.7%
E 78961
13.5%
A 50804
8.7%
C 50294
8.6%
S 49488
8.5%
F 33472
 
5.7%
M 23331
 
4.0%
I 21537
 
3.7%
L 19459
 
3.3%
Other values (11) 51895
8.9%
Space Separator
ValueCountFrequency (%)
1
100.0%
Other Punctuation
ValueCountFrequency (%)
? 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5903623
> 99.9%
Common 2
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 940552
15.9%
a 860854
14.6%
e 744168
12.6%
d 669425
11.3%
r 404257
 
6.8%
l 239327
 
4.1%
c 213294
 
3.6%
o 197967
 
3.4%
n 189947
 
3.2%
t 142863
 
2.4%
Other values (33) 1300969
22.0%
Common
ValueCountFrequency (%)
1
50.0%
? 1
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5903625
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 940552
15.9%
a 860854
14.6%
e 744168
12.6%
d 669425
11.3%
r 404257
 
6.8%
l 239327
 
4.1%
c 213294
 
3.6%
o 197967
 
3.4%
n 189947
 
3.2%
t 142863
 
2.4%
Other values (35) 1300971
22.0%

genus
Text

Distinct 2021
Distinct (%) 0.3%
Missing 208
Missing (%) < 0.1%
Memory size 4.5 MiB

Length

Max length 18
Median length 15
Mean length 8.461662195
Min length 3

Characters and Unicode

Total characters 4944860
Distinct characters 52
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 80
Unique (%) < 0.1%

Sample

1st row Paroaria
2nd row Rostrhamus
3rd row Bartramia
4th row Sterna
5th row Prionochilus

Value                  Count     Frequency (%)
dendroica              14825     2.5%
parus                  7485      1.3%
melospiza              7103      1.2%
turdus                 6813      1.2%
vireo                  6403      1.1%
calidris               6372      1.1%
sterna                 6184      1.1%
agelaius               5525      0.9%
carduelis              5507      0.9%
picoides               5086      0.9%
Other values (2011)    513081    87.8%

Most occurring characters

ValueCountFrequency (%)
a 519684
 
10.5%
i 398130
 
8.1%
o 387018
 
7.8%
s 383052
 
7.7%
r 365698
 
7.4%
u 308787
 
6.2%
e 306933
 
6.2%
l 267420
 
5.4%
n 224162
 
4.5%
c 212754
 
4.3%
Other values (42) 1571222
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4360476
88.2%
Uppercase Letter 584384
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 519684
11.9%
i 398130
9.1%
o 387018
 
8.9%
s 383052
 
8.8%
r 365698
 
8.4%
u 308787
 
7.1%
e 306933
 
7.0%
l 267420
 
6.1%
n 224162
 
5.1%
c 212754
 
4.9%
Other values (16) 986838
22.6%
Uppercase Letter
ValueCountFrequency (%)
C 92539
15.8%
P 87929
15.0%
A 57065
9.8%
S 48723
8.3%
M 44863
 
7.7%
T 42420
 
7.3%
D 28038
 
4.8%
L 25718
 
4.4%
E 22718
 
3.9%
G 16758
 
2.9%
Other values (16) 117613
20.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 4944860
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 519684
 
10.5%
i 398130
 
8.1%
o 387018
 
7.8%
s 383052
 
7.7%
r 365698
 
7.4%
u 308787
 
6.2%
e 306933
 
6.2%
l 267420
 
5.4%
n 224162
 
4.5%
c 212754
 
4.3%
Other values (42) 1571222
31.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4944860
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 519684
 
10.5%
i 398130
 
8.1%
o 387018
 
7.8%
s 383052
 
7.7%
r 365698
 
7.4%
u 308787
 
6.2%
e 306933
 
6.2%
l 267420
 
5.4%
n 224162
 
4.5%
c 212754
 
4.3%
Other values (42) 1571222
31.8%
Distinct 4700
Distinct (%) 0.8%
Missing 980
Missing (%) 0.2%
Memory size 4.5 MiB

Length

Max length 21
Median length 17
Mean length 8.77786783
Min length 3

Characters and Unicode

Total characters 5122869
Distinct characters 28
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 320
Unique (%) 0.1%

Sample

1st row capitata
2nd row sociabilis
3rd row longicauda
4th row hirundo
5th row plateni

Value                  Count     Frequency (%)
melodia                5111      0.9%
phoeniceus             4998      0.9%
hyemalis               4955      0.8%
americana              4691      0.8%
canadensis             3854      0.7%
sandwichensis          3774      0.6%
pusilla                3581      0.6%
alpestris              3383      0.6%
carolinensis           3309      0.6%
petechia               3061      0.5%
Other values (4690)    542895    93.0%

Most occurring characters

ValueCountFrequency (%)
a 631378
12.3%
i 561482
11.0%
s 511762
10.0%
r 364696
 
7.1%
u 364611
 
7.1%
e 356787
 
7.0%
l 334832
 
6.5%
n 308110
 
6.0%
c 308036
 
6.0%
o 275389
 
5.4%
Other values (18) 1105786
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5121773
> 99.9%
Other Punctuation 1096
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 631378
12.3%
i 561482
11.0%
s 511762
10.0%
r 364696
 
7.1%
u 364611
 
7.1%
e 356787
 
7.0%
l 334832
 
6.5%
n 308110
 
6.0%
c 308036
 
6.0%
o 275389
 
5.4%
Other values (16) 1104690
21.6%
Other Punctuation
ValueCountFrequency (%)
. 1059
96.6%
/ 37
 
3.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 5121773
> 99.9%
Common 1096
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 631378
12.3%
i 561482
11.0%
s 511762
10.0%
r 364696
 
7.1%
u 364611
 
7.1%
e 356787
 
7.0%
l 334832
 
6.5%
n 308110
 
6.0%
c 308036
 
6.0%
o 275389
 
5.4%
Other values (16) 1104690
21.6%
Common
ValueCountFrequency (%)
. 1059
96.6%
/ 37
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5122869
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 631378
12.3%
i 561482
11.0%
s 511762
10.0%
r 364696
 
7.1%
u 364611
 
7.1%
e 356787
 
7.0%
l 334832
 
6.5%
n 308110
 
6.0%
c 308036
 
6.0%
o 275389
 
5.4%
Other values (18) 1105786
21.6%

infraspecificEpithet
Text

Missing 

Distinct 7404
Distinct (%) 2.3%
Missing 268369
Missing (%) 45.9%
Memory size 4.5 MiB

Length

Max length 18
Median length 16
Mean length 8.927149512
Min length 2

Characters and Unicode

Total characters 2822970
Distinct characters 30
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 868
Unique (%) 0.3%

Sample

1st row solitarius
2nd row flavoolivaceus
3rd row satrapa
4th row australis
5th row malherbii

Value                  Count     Frequency (%)
carolinensis           1803      0.6%
pusilla                1304      0.4%
pinus                  1235      0.4%
frontalis              1217      0.4%
coronata               1200      0.4%
occidentalis           1189      0.4%
arizonae               1068      0.3%
olivaceus              1061      0.3%
flammea                1046      0.3%
hyemalis               1005      0.3%
Other values (7395)    304124    96.2%

Most occurring characters

ValueCountFrequency (%)
i 337723
12.0%
a 320893
11.4%
s 290503
10.3%
e 217863
 
7.7%
r 199489
 
7.1%
n 194553
 
6.9%
u 177622
 
6.3%
l 170740
 
6.0%
o 156011
 
5.5%
c 147667
 
5.2%
Other values (20) 609906
21.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2822868
> 99.9%
Dash Punctuation 41
 
< 0.1%
Other Punctuation 32
 
< 0.1%
Space Separator 29
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 337723
12.0%
a 320893
11.4%
s 290503
10.3%
e 217863
 
7.7%
r 199489
 
7.1%
n 194553
 
6.9%
u 177622
 
6.3%
l 170740
 
6.0%
o 156011
 
5.5%
c 147667
 
5.2%
Other values (16) 609804
21.6%
Other Punctuation
ValueCountFrequency (%)
. 29
90.6%
? 3
 
9.4%
Dash Punctuation
ValueCountFrequency (%)
- 41
100.0%
Space Separator
ValueCountFrequency (%)
29
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2822868
> 99.9%
Common 102
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 337723
12.0%
a 320893
11.4%
s 290503
10.3%
e 217863
 
7.7%
r 199489
 
7.1%
n 194553
 
6.9%
u 177622
 
6.3%
l 170740
 
6.0%
o 156011
 
5.5%
c 147667
 
5.2%
Other values (16) 609804
21.6%
Common
ValueCountFrequency (%)
- 41
40.2%
. 29
28.4%
29
28.4%
? 3
 
2.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2822970
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 337723
12.0%
a 320893
11.4%
s 290503
10.3%
e 217863
 
7.7%
r 199489
 
7.1%
n 194553
 
6.9%
u 177622
 
6.3%
l 170740
 
6.0%
o 156011
 
5.5%
c 147667
 
5.2%
Other values (20) 609906
21.6%

taxonRank
Text

Constant  Missing 

Distinct 1
Distinct (%) < 0.1%
Missing 268369
Missing (%) 45.9%
Memory size 4.5 MiB

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 3162230
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row subspecies
2nd row subspecies
3rd row subspecies
4th row subspecies
5th row subspecies
ValueCountFrequency (%)
subspecies 316223
100.0%
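
The figures for this variable can be cross-checked with simple arithmetic. The sketch below uses the total of 584592 observations from the report overview together with the missing count shown above; the variable names are illustrative only.

```python
# Figures copied from the report: total observations and missing taxonRank cells.
total_rows = 584592
missing = 268369

# Rows that actually carry a taxonRank value.
present = total_rows - missing

# Missing share, rounded to one decimal place as the report displays it.
missing_pct = round(100 * missing / total_rows, 1)

# Every present value is the 10-character string "subspecies".
total_characters = present * len("subspecies")

print(present)           # rows with a value
print(missing_pct)       # percent missing
print(total_characters)  # characters across all present values
```

The three results agree with the count next to "subspecies", the reported Missing (%) and the reported Total characters, so the "100.0%" frequency is relative to the non-missing rows only.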

Most occurring characters

ValueCountFrequency (%)
s 948669
30.0%
e 632446
20.0%
u 316223
 
10.0%
b 316223
 
10.0%
p 316223
 
10.0%
c 316223
 
10.0%
i 316223
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3162230
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 948669
30.0%
e 632446
20.0%
u 316223
 
10.0%
b 316223
 
10.0%
p 316223
 
10.0%
c 316223
 
10.0%
i 316223
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3162230
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 948669
30.0%
e 632446
20.0%
u 316223
 
10.0%
b 316223
 
10.0%
p 316223
 
10.0%
c 316223
 
10.0%
i 316223
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3162230
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 948669
30.0%
e 632446
20.0%
u 316223
 
10.0%
b 316223
 
10.0%
p 316223
 
10.0%
c 316223
 
10.0%
i 316223
 
10.0%
Distinct 148
Distinct (%) 13.0%
Missing 583452
Missing (%) 99.8%
Memory size 4.5 MiB

Length

Max length 42
Median length 29
Mean length 7.978947368
Min length 3

Characters and Unicode

Total characters 9096
Distinct characters 49
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 82
Unique (%) 7.2%

Sample

1st row Olson
2nd row Ridgway in Baird et al.
3rd row Ridgway
4th row Ridgway
5th row Wetmore
ValueCountFrequency (%)
ridgway 309
22.4%
wetmore 118
 
8.6%
nelson 113
 
8.2%
deignan 85
 
6.2%
oberholser 79
 
5.7%
47
 
3.4%
phillips 42
 
3.0%
baird 40
 
2.9%
ripley 30
 
2.2%
riley 27
 
2.0%
Other values (134) 489
35.5%

Most occurring characters

ValueCountFrequency (%)
e 969
 
10.7%
i 699
 
7.7%
a 650
 
7.1%
r 569
 
6.3%
n 568
 
6.2%
o 526
 
5.8%
l 495
 
5.4%
d 447
 
4.9%
g 424
 
4.7%
s 415
 
4.6%
Other values (39) 3334
36.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7469
82.1%
Uppercase Letter 1262
 
13.9%
Space Separator 239
 
2.6%
Other Punctuation 90
 
1.0%
Open Punctuation 18
 
0.2%
Close Punctuation 18
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 969
13.0%
i 699
9.4%
a 650
 
8.7%
r 569
 
7.6%
n 568
 
7.6%
o 526
 
7.0%
l 495
 
6.6%
d 447
 
6.0%
g 424
 
5.7%
s 415
 
5.6%
Other values (14) 1707
22.9%
Uppercase Letter
ValueCountFrequency (%)
R 391
31.0%
W 124
 
9.8%
N 117
 
9.3%
O 105
 
8.3%
D 99
 
7.8%
B 93
 
7.4%
P 67
 
5.3%
M 47
 
3.7%
S 30
 
2.4%
T 29
 
2.3%
Other values (10) 160
12.7%
Other Punctuation
ValueCountFrequency (%)
& 47
52.2%
. 43
47.8%
Space Separator
ValueCountFrequency (%)
239
100.0%
Open Punctuation
ValueCountFrequency (%)
( 18
100.0%
Close Punctuation
ValueCountFrequency (%)
) 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8731
96.0%
Common 365
 
4.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 969
 
11.1%
i 699
 
8.0%
a 650
 
7.4%
r 569
 
6.5%
n 568
 
6.5%
o 526
 
6.0%
l 495
 
5.7%
d 447
 
5.1%
g 424
 
4.9%
s 415
 
4.8%
Other values (34) 2969
34.0%
Common
ValueCountFrequency (%)
239
65.5%
& 47
 
12.9%
. 43
 
11.8%
( 18
 
4.9%
) 18
 
4.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9096
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 969
 
10.7%
i 699
 
7.7%
a 650
 
7.1%
r 569
 
6.3%
n 568
 
6.2%
o 526
 
5.8%
l 495
 
5.4%
d 447
 
4.9%
g 424
 
4.7%
s 415
 
4.6%
Other values (39) 3334
36.7%